* Countries to process (two-letter LIS country codes)
local tata " ca de es fi fr it no dk lu nl us il au"

* Step 1: for each country-year, combine the person and household files
* and save the result as a temporary dataset.
foreach gogo in `tata' {
    * Country-year datasets used for each country
    if "`gogo'" == "au" local fifi "au85 au89 au95 au01 au03 au10"
    if "`gogo'" == "ca" local fifi "ca87 ca91 ca94 ca98 ca04 ca10"
    if "`gogo'" == "de" local fifi "de84 de89 de94 de00 de04 de10"
    if "`gogo'" == "dk" local fifi "dk87 dk92 dk95 dk00 dk04 dk10"
    if "`gogo'" == "es" local fifi "es85 es90 es95 es00 es04 es10"
    if "`gogo'" == "fi" local fifi "fi87 fi91 fi95 fi00 fi04 fi10"
    if "`gogo'" == "fr" local fifi "fr84 fr89 fr94 fr00 fr05 fr10"
    if "`gogo'" == "il" local fifi "il86 il92 il97 il01 il05 il10"
    if "`gogo'" == "it" local fifi "it86 it91 it95 it00 it04 it10"
    if "`gogo'" == "lu" local fifi "lu85 lu91 lu94 lu00 lu04 lu10"
    if "`gogo'" == "nl" local fifi "nl83 nl90 nl93 nl99 nl04 nl10"
    if "`gogo'" == "no" local fifi "no86 no91 no95 no00 no04 no10"
    if "`gogo'" == "us" local fifi "us86 us91 us94 us00 us04 us10"

    * toto is the country-year: the first one is ca87, the last one is au10,
    * and Stata goes through all of them to build our own temporary datasets.
    foreach toto in `fifi' {
        * Each country-year has a person-level file and a household-level file.
        * E.g. $fr84p refers to the person-level file of the France 1984 survey.
        * The variables have to be pulled out of both files.
        local perso "$`toto'p"   // person-level file for this country-year
        local house "$`toto'h"   // household-level file for this country-year

        * Quietly load the listed variables from the person file of this
        * country-year, clearing the previous file from memory.
        * Example: ppopwgt inflates the sample to the population of the country
        * (about 60 million people in France, 300 million in the US, etc.).
        qui use hid ppopwgt age sex relation educ* immigr nchildren educ_c ///
            ethnic_c partner pil* ptime* emp* cname using `perso', clear

        * Join the retained person-level variables to the household dataset.
        qui joinby hid using `house'

        * Keep the person-level variables loaded above plus the household-level ones.
        * relation = relation of the individual to the household head
        * deflat   = LIS income deflator; unreliable and mostly does not work
        keep hid ppopwgt age sex relation educ* year iso2 hpopwgt immigr ///
            ethnic_c dhi nchildren educ_c nhhmem emp* partner region_c pi ///
            pil ptime* cname

        * The local "save" holds the output name: u plus the country-year,
        * e.g. uca87 for the dataset uca87.dta with the variables we want.
        local save "u`toto'"
        qui save `save', replace   // the local only defined the name; this writes the file
    }
}
* Both loops are closed. Each country now has six country-year files with all
* the information we need, but they still have to be combined into one file.

* Step 2: append all country-year files into a single dataset.
clear
foreach gogo in `tata' {
    if "`gogo'" == "au" local fifi "au85 au89 au95 au01 au03 au10"
    if "`gogo'" == "ca" local fifi "ca87 ca91 ca94 ca98 ca04 ca10"
    if "`gogo'" == "de" local fifi "de84 de89 de94 de00 de04 de10"
    if "`gogo'" == "dk" local fifi "dk87 dk92 dk95 dk00 dk04 dk10"
    if "`gogo'" == "es" local fifi "es85 es90 es95 es00 es04 es10"
    if "`gogo'" == "fi" local fifi "fi87 fi91 fi95 fi00 fi04 fi10"
    if "`gogo'" == "fr" local fifi "fr84 fr89 fr94 fr00 fr05 fr10"
    if "`gogo'" == "il" local fifi "il86 il92 il97 il01 il05 il10"
    if "`gogo'" == "it" local fifi "it86 it91 it95 it00 it04 it10"
    if "`gogo'" == "lu" local fifi "lu85 lu91 lu94 lu00 lu04 lu10"
    if "`gogo'" == "nl" local fifi "nl83 nl90 nl93 nl99 nl04 nl10"
    if "`gogo'" == "no" local fifi "no86 no91 no95 no00 no04 no10"
    if "`gogo'" == "us" local fifi "us86 us91 us94 us00 us04 us10"

    foreach toto in `fifi' {
        * The six separate files per country must be aggregated into a single one.
        * Inside this loop we stay on one country until all of its files are appended.
        local save "u`toto'"       // e.g. "uca87", "uca94", etc.
        qui append using `save'    // "save" is simply the name of the file saved in step 1
    }
}
save $mydata/prosoc/total12pnew, replace
su *

use $mydata/prosoc/total12pnew, clear
su *

gen Educlev = educlev
drop educlev

* Recode the survey years into five-year intervals
qui recode year (1968/1971=1970) (1972/1976=1975) (1977/1982=1980) ///
    (1983/1987=1985) (1988/1992=1990) (1993/1997=1995) (1998/2002=2000) ///
    (2003/2006=2005) (2007/2012=2010), gen(year5)

* Generate the weights that we need
qui gen pweight = int(ppopwgt)

* Assign everyone to a five-year age group, always the lower bound
* (e.g. 26 -> 25 and 29 -> 25)
gen page = floor(age/5)*5
gen age5 = floor(age/5)*5

* Restrict to the age range and period we want; repeated in the regression
qui keep if age5 >= 25 & age5 <= 65
qui keep if year5 >= 1985 & year5 <= 2010

* Add the PPP data, already available at LIS in a separate dataset
merge m:1 cname year using $myincl//ppp.dta
ta _m
keep if _m == 3
ta lisppp

* (Log of) equivalized DHI per standard adult
gen pppdhi = dhi/lisppp
gen eqdhi = pppdhi/sqrt(nhhmem)
gen leqdhi = ln(eqdhi)
ta iso2 year5 [fw=pweight], s(leqdhi) nofreq nost noobs w

gen female = sex == 2
gen ter = Educlev >= 311 if Educlev != .   // tertiary education dummy
ta iso2 year5

keep nhhmem Educlev year5 pweight page iso2 female partner immigr ter leqdhi
su *
save $mydata/prosoc/APC_WS.dta, replace
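
* ---------------------------------------------------------------------
* Optional checks: a minimal sketch, not part of the original workflow.
* Assumes the dataset saved to APC_WS.dta above is still in memory.
* ---------------------------------------------------------------------
* Worked example of the square-root equivalence scale used for eqdhi:
* a 4-person household with PPP-adjusted DHI of 40000 gets
* 40000/sqrt(4) = 20000 per standard adult.
display 40000/sqrt(4)

* The sample restrictions applied above imply these ranges; each assert
* stops the do-file with an error if the data do not match them.
assert inrange(page, 25, 65)       // five-year age groups 25 to 65
assert inrange(year5, 1985, 2010)  // five-year periods 1985 to 2010
assert inlist(female, 0, 1)        // female is a 0/1 dummy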