In the text below, I discuss how exactly I will identify the corpus: my selected sample of texts from all texts printed in all national newspapers during the study period. The topic is not important here; what is important is how you would define my sample.
I want to research how the national newspapers of Kazakhstan covered the riots of oil industry workers that happened last year. My hypothesis is that newspapers which publish government-sponsored articles tend to share the President's position on the issue, while newspapers that lack such support tend to share the oil workers' position. Therefore my DV is the «position of the newspaper», measured as pro-President / neutral / pro-workers. I will also test other factors that might influence the position of the newspaper. The first IV is the share of governmental support in the newspaper's annual budget, measured as a percentage on a continuous scale. The second IV is the newspaper's circulation (number of copies). The third IV is the location of the newspaper's main office (the capital or the second largest city). The fourth IV is the language of the newspaper (Russian, Kazakh or both). The fifth IV is a dichotomous variable indicating whether the newspaper has a website.
Although it is possible to conduct the content analysis using simple random sampling, that would be too costly. Therefore, I will rely on more specific types of random sampling. Since national newspapers are issued in only two major cities in Kazakhstan (the capital and the former capital), only newspapers from these two cities will be considered. Moreover, only newspapers covering socio-political topics will be included in the sample; purely advertising or thematically specialized newspapers will be excluded.
Firstly, I will focus on the newspapers issued during the three months following the riots. Secondly, having the list of all newspaper titles, I will split them into two subgroups: daily and weekly newspapers. Thirdly, I will select newspapers proportionally to the size of each subgroup. I will pick roughly one third of the newspapers from each subgroup; in other words, every third newspaper in each subgroup will be selected once the newspapers are arranged alphabetically by title. The following step is reading all articles in each selected issue and identifying the articles dedicated directly or indirectly to the riots.
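To make the selection of newspapers concrete, here is a minimal sketch of how I imagine this step, under some assumptions: the titles are placeholders rather than real newspapers, the function names `systematic_sample` and `select_newspapers` are my own, and I start counting from the first title in each alphabetical list (a randomly chosen starting position would be another option).

```python
def systematic_sample(titles, step=3, start=0):
    """Sort titles alphabetically and keep every `step`-th one,
    beginning at position `start` (0 = first title in the list)."""
    ordered = sorted(titles, key=str.casefold)
    return ordered[start::step]


def select_newspapers(newspapers, step=3):
    """`newspapers` maps each title to its subgroup ('daily' or 'weekly').
    The same sampling interval is applied within each subgroup, so the
    selection stays proportional to subgroup size."""
    strata = {}
    for title, frequency in newspapers.items():
        strata.setdefault(frequency, []).append(title)
    return {
        frequency: systematic_sample(titles, step)
        for frequency, titles in strata.items()
    }


# Placeholder titles, for illustration only (not real newspapers).
newspapers = {
    "Paper A": "daily", "Paper B": "daily", "Paper C": "daily",
    "Paper D": "daily", "Paper E": "daily", "Paper F": "daily",
    "Paper G": "weekly", "Paper H": "weekly", "Paper I": "weekly",
}

selected = select_newspapers(newspapers)
print(selected)
```

With six placeholder dailies and three weeklies, this keeps two dailies and one weekly, which is the one-in-three proportion I have in mind.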
Finally, each relevant article will be assigned an order number according to its position in the newspaper, starting from the first page. The last stage, which ensures the random character of the selection, is selecting 95% of the articles dedicated to the riots using a table of random numbers.
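As a rough sketch of this last stage, assuming a pseudo-random number generator stands in for the printed table of random numbers: the article labels are placeholders, the function name `draw_articles` is my own, and the seed is only there to make the draw repeatable.

```python
import random


def draw_articles(articles, share=0.95, seed=None):
    """Number the articles in order of appearance (starting at 1) and
    draw `share` of them at random, without replacement."""
    rng = random.Random(seed)
    numbered = list(enumerate(articles, start=1))
    k = round(share * len(numbered))
    chosen = rng.sample(numbered, k)
    # Return the drawn articles in their original order of appearance.
    return sorted(chosen)


# Placeholder articles, for illustration only.
articles = [f"article {i}" for i in range(1, 41)]
sample = draw_articles(articles, seed=1)
print(len(sample), "of", len(articles), "articles selected")
```

For 40 relevant articles this draws 38 and leaves 2 out.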
Is anything missing here in my sampling design? Did I identify anything incorrectly in my sampling procedure?