Stata is a general-purpose statistical software package developed by StataCorp and first released in 1985. It offers a fast, convenient, and accurate tool for data handling, data manipulation, visualization, statistical modeling, and reproducible reporting. It is used primarily by researchers and academics in fields such as economics, the social sciences, and biomedicine. Stata provides both a graphical user interface (GUI), whose menus expose the available analyses, and a command window for custom code, which makes the software approachable for beginners and flexible for experienced users. It has therefore gained prominence in dissertation work: a researcher can carry out the full analysis in one tool and produce a concise yet powerful report in convenient formats such as MS Word and MS Excel for sharing with others.
Important features of Stata for dissertation work
Fig. 1: Data Editor in Stata (adapted from www.stata.com)
Stata’s data management features give the user extensive control. Stata can import and export data in many formats and can work with data from several sources at the same time. It also provides commands to sort, match, merge, create, and append data, so raw data from a source can easily be prepared for further analysis. When new data become available, they can be appended to the existing dataset, which avoids repeating earlier data preparation. Stata handles both text and numerical data and displays them in the familiar spreadsheet layout that most users are comfortable with. Finally, depending on the edition, Stata can handle datasets with billions of observations (rows) and, in the largest edition (Stata/MP), up to 120,000 variables. A representative view of the Data Editor is shown in Figure 1.
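The data management steps described above can be sketched in a few commands. This is only an illustration under stated assumptions: the file names (survey_2022.csv, survey_2023.csv, regions.dta) and the key variable region_id are hypothetical, and regions.dta is assumed to hold exactly one row per region_id.

```stata
* Hypothetical files: survey_2022.csv, survey_2023.csv, regions.dta
* Assumption: every file contains a variable called region_id

* Import the first raw file and save it in Stata's native format
import delimited using "survey_2022.csv", clear
save "survey_2022.dta", replace

* Import the newer file and append the earlier observations to it
import delimited using "survey_2023.csv", clear
append using "survey_2022.dta"

* Add region-level variables; many survey rows match one region row
merge m:1 region_id using "regions.dta"
tabulate _merge          // check how many rows matched

* Sort and export the combined dataset for sharing
sort region_id
export excel using "survey_combined.xlsx", firstrow(variables) replace
```

The m:1 form of merge is used because several survey rows are expected to share one region. If both files had one row per key, 1:1 would be the appropriate choice.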
Statistical features and graphical capabilities in Stata
Stata allows the user either to choose an item from the menus or to write custom code to perform an analysis. Two menus matter most for analysis: Statistics and Graphics. The “Statistics” menu contains a total of 21 submenus listing a plethora of options for statistical analysis. These include summary statistics, basic statistics, regression, time series, survival analysis, model analysis, panel data, multilevel models, causal inference, correlation, Bayesian analysis, multiple imputation, latent class analysis (LCA), survey methods, multivariate methods, and many more. Each of these 21 submenus contains further entries, which give the user an extensive range of choices for an appropriate analysis of the data.
Fig. 2: Word Cloud showing the different options for statistical analysis in Stata (adapted from www.stata.com)
Therefore, irrespective of the nature of the dissertation study and the type of data involved, an appropriate method is available to produce the desired outcome. Similarly, the “Graphics” menu lists 21 submenus from which to choose an appropriate type of display. Choosing from such a wide variety of options may sound daunting, but Stata provides a very helpful built-in guide. In the command window, the user can type “help pca”, for example, and detailed documentation on principal component analysis opens. Figure 2 is a word cloud of the numerous statistical methods available in Stata, and Figure 3 shows a few examples of the publication-ready graphics that Stata can produce.
Fig. 3: A few examples of graphical outputs from Stata (adapted from www.stata.com)
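For readers who prefer the command window to the menus, the kind of analysis, graphics, and help lookup discussed in this section might look like the following minimal sketch. It uses the auto example dataset that ships with Stata. The choice of variables and the output file name price_mpg.png are illustrative assumptions only.

```stata
* Load the auto example dataset that ships with Stata
sysuse auto, clear

* Summary statistics and a linear regression
summarize price mpg weight
regress price mpg weight

* A scatterplot with a fitted line, exported as an image file
twoway (scatter price mpg) (lfit price mpg)
graph export "price_mpg.png", replace

* Open the help file for principal component analysis
help pca

* Search the documentation by keyword when the command name is unknown
search principal component analysis
```

Each menu selection in Stata also echoes its equivalent command in the Results window, so the menus can serve as a way to learn the syntax.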
Stata has built-in reporting features that help the user combine results, graphs, and other output of an analysis with tables and formatted text and export them directly to MS Word, PDF, Excel, and HTML formats. These reports are reproducible: because they are generated from commands, they can be regenerated automatically when the data change. Written commands can easily be saved in a “do-file” so that the same code can be reused later. A “do-file” also lets the user submit many Stata commands to the tool for execution at once, and Stata allows more than one “do-file” to be open at a time. Repetitive analyses therefore run very efficiently when macros and loops are incorporated in a “do-file”.
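As a rough illustration of how a do-file, a macro, a loop, and the Word export fit together, consider the sketch below. The do-file name analysis.do, the report name report.docx, and the choice of predictors are hypothetical. The putdocx command is assumed to be available, which requires Stata 15 or later.

```stata
* analysis.do (hypothetical name); run the whole file with: do analysis.do
* Assumption: Stata 15 or later, which provides putdocx
sysuse auto, clear

* Store the list of predictors in a local macro
local predictors mpg weight length

* Start a Word document in memory and add a heading
putdocx begin
putdocx paragraph, style(Heading1)
putdocx text ("Regressions of price on single predictors")

* Run one regression per predictor and add each table to the report
foreach x of local predictors {
    regress price `x'
    putdocx paragraph
    putdocx text ("Predictor: `x'")
    putdocx table tbl_`x' = etable
}

* Write the document to disk
putdocx save "report.docx", replace
```

Rerunning the do-file after the data change regenerates the whole report. Local macros exist only while the do-file runs, so the file should be executed as a whole rather than line by line.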
Benefits and limitations of the Stata tool in dissertation work
Stata is very well supported, both by StataCorp and by documentation and professional communities worldwide. StataCorp releases updates and fixes for reported glitches roughly every two months. The professional community also provides excellent support to Stata users, and Stata allows community-contributed (third-party) commands to be downloaded and installed directly in the software. Moreover, Stata’s integrated version control allows code written for any earlier version to run in the latest version. Even a dataset created in Stata a few decades ago can still be opened and analyzed in the latest version. One significant drawback, however, is that the reverse does not hold: a file (either a data file or an output file) created with the latest version generally cannot be opened in an earlier version, so users need to stay up to date. On the other hand, updates within a release are free of charge, and Stata has no add-on modules that must be purchased separately. Stata is also priced comparatively lower than similar commercial products on the market. Hence, it is a very useful tool for anyone who wants to use it extensively for dissertation work.
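The support and compatibility points above correspond to a handful of commands, sketched here under a few assumptions: the version number 14 is arbitrary, estout is just one example of a community-contributed package on the SSC archive, and auto_v13.dta is a hypothetical file name. The saveold command is one way to reduce the backward-compatibility problem, because it writes a copy of the data in an older file format.

```stata
* Interpret the commands below as Stata 14 would, even in a newer release
version 14

* Install a community-contributed command from the SSC archive
* (estout is used here only as an example package)
ssc install estout, replace

* Check whether official updates are available for this release
update query

* Save a copy of the data in an older file format for a colleague
* who runs an earlier version of Stata
sysuse auto, clear
saveold "auto_v13.dta", version(13) replace
```

Placing a version statement at the top of every do-file is a common habit, since it keeps old code behaving the same way after the software is upgraded.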