**** run these lines just once to build the large dataset
**** => saved as "$mydata/prosoc/tmplis15"
* create the extract dataset from LIS
* (we have more variables than needed but ...)
* we keep a set of 16 advanced countries on the time span 2010-today
lissyuse, pvars(dname hid ppopwgt age sex relation) ///
    hvars(hid cname year iso2 hpopwgt dhi hifactor nhhmem region_c) ///
    lis iso2(at au be ca de dk es fi fr il it lu nl no uk us) from(2010) to(2020)
save "$mydata/prosoc/tmplis15", replace
**** end of the lines to run just once

**** since the extract "$mydata/prosoc/tmplis15" already exists we can begin here...
* (use takes the option clear, not replace)
use "$mydata/prosoc/tmplis15", clear

* useful variables: country-year code, placeholders for the two ginis,
* and equivalised household (HH) income before and after redistribution
gen ccyyyy=iso2+string(year)
gen gdhi=.
gen ghif=.
gen edhi=dhi/sqrt(nhhmem)
gen ehif=hifactor/sqrt(nhhmem)

* we keep only the latest year for each country ...
bysort iso2: egen maxy=max(year)
keep if year==maxy
drop maxy

* tricks: here is a series of standard treatments for LIS key figures compatibility...
* drop "bad" disposable incomes
drop if dhi==. | dhi==0
* select only if there is a weight
gen hwgt=hpopwgt
drop if hwgt==. | hwgt==0
* trick: create person weight as hwgt times the number of household members
* trick: this weight is non-zero and works "almost all the time" as a frequency weight
* BE CAREFUL = ONLY A DESCRIPTIVE WEIGHT, NOT TO BE USED FOR REGRESSIONS, MODELS, etc.
generate wt=int(hwgt*nhhmem*100)

* keep the head of household (HH), in "non elderly HH"
keep if relation==1000
keep if age<65

* for each country-year we compute the gini of edhi = equivalised disposable HH income
* same thing for ghif = gini of factor income = before redistribution
* trick: recode ccyyyy from 1 to max, max = total number of country-years...
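encode ccyyyy, gen(nname)
su nname
local max=r(max)

* trick: we make use of the LIS key-figures top and bottom codings...
* gini is evaluated through the SJ+PVK ineqdec0 command ...
* the gini of each income is then saved in a variable (gdhi, ghif)
forvalues i=1/`max' {
    quietly {
        * bottom code: 1% of the weighted mean of equivalised income
        sum edhi [w=wt] if nname==`i'
        generate botlin=0.01*r(mean) if nname==`i'
        replace edhi=botlin if edhi<botlin & nname==`i'
        * NOTE: the source lost the text between "edhi<" and ">toplin".
        * ASSUMPTION: the three lines below rebuild the missing top code from
        * the usual LIS key-figures rule (10 times the median of unequivalised
        * dhi); check them against the original do-file before relying on them.
        sum dhi [w=hwgt] if nname==`i', detail
        generate toplin=10*r(p50) if nname==`i'
        replace edhi=toplin/sqrt(nhhmem) if dhi>toplin & nname==`i'
        drop botlin toplin
        ineqdec0 edhi [w=wt] if nname==`i'
        replace gdhi=r(gini) if nname==`i'
        ineqdec0 ehif [w=wt] if nname==`i'
        replace ghif=r(gini) if nname==`i'
    }
}

* we then collapse by ccyyyy
collapse gdhi ghif, by(ccyyyy)

* we then graph the horizontal bars
* (plenty of tricks to order the ccyyyy)
graph hbar ghif gdhi, over(ccyyyy, sort(2) descending) scale(.6) ///
    blabel(bar, position(outside) format(%5.2f)) legend(off)

* important: this command is needed to have the graph on lissy...
graphexportpdf $mypdf/graphteststata.pdf, replace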
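
* Optional check, not part of the original script: a minimal sketch that
* prints the collapsed figures behind the graph, so the bars can also be read
* as numbers in the listing. It assumes the collapse above has run and left
* one row per country-year with ccyyyy, ghif and gdhi.
* "redis" is a hypothetical helper variable introduced here for illustration:
* the factor-income gini minus the disposable-income gini.
gen redis = ghif - gdhi
format ghif gdhi redis %5.3f
list ccyyyy ghif gdhi redis, clean noobs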