Stata commands

NOTE: This is a very brief summary of the commands covered in class. You absolutely must have a look at the online help for the command you need to figure out (whelp command), and you should consult the manual for a more extensive understanding of how a given command works, as only very basic usage is given here.

The notation roughly follows the Stata convention. The commands are set in typewriter font and are to be typed exactly as spelled (or shorthanded as shown by the underlined text), the entries to be filled in by the user are shown in italics, and optional elements are put into [square brackets].

Stata guts and concepts

Color: context

Syntax: on-line, context

Data types and storage formats: on-line, context.

Missing values: on-line, context: 1, 2.

_n and _N: on-line, context

Variable list: on-line, context: 1

Condition qualifiers: on-line in, if, context: 1, 2, 3

by : construct: online, context: 1

Return values: return, ereturn

Stata commands by type

(scroll down for commands in alphabetical order)

Interface and usability

Help and search: help, whelp, search, findit

Log files: log, cmdlog

Hand calculator: display


Data handling

File operations: clear, use, ls, save, infile, outfile, sysuse, preserve, restore

Operations on variables: generate, label, replace, egen, mvencode, mvdecode, keep, drop, range, recode

Memory: memory, set memory, compress

Looking at data: describe, list, browse, compare, count

Sorting: sort, gsort, aorder, order, move

Labels and notes: label, notes

Basic summaries

Means, variances, medians, percentiles: summarize.

Tabulations: tabulate, table.

Correlations, covariances: correlate, pwcorr.

Other: inspect, lv



Box plots: graph box

Scatter plots: scatter, twoway

Estimation routines

The summary of the estimation commands: online, context.

Post-estimation commands: test, testnl, lincom, nlcom, predict, ereturn

Basic methods: regress, boxcox

Regression diagnostics: online, context; commands: hettest (heteroskedasticity), ovtest (nonlinearity), imtest (distribution of the regression errors), dwstat (Durbin-Watson test), archlm (LM test for ARCH), vif (collinearity)


Executing do-files: do, run

Output: capture, display, quietly, noisily, more

Macros: local, global

Loops: foreach, forvalues

Stata commands in alphabetical order

aorder [varlist]
Sorts the variables in varlist in alphabetic order and moves them to the front of the dataset.
On-line, context

archlm [, lags(#)]
Lagrange multiplier test for autoregressive conditional heteroskedasticity
On-line, context

avplot [varlist]
Plots the added variable plot of dependent variable vs. variables in varlist, one by one, conditional on other regressors.
Available only after regress
avplots gives added variable plots for all regressors in the model.
On-line, context

browse [varlist] [if exp]
Browse the data
On-line, context

capture Stata command
Executes a command suppressing its output and proceeds further regardless of the error status
If you need to see the output and the error message, enter capture noisily.
On-line, context
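
As an illustration of the behaviour described above, here is a minimal sketch meant to be run from a do-file. It assumes the auto data set that ships with Stata; the variable name no_such_var is hypothetical and is chosen because it does not exist in that data set.

```stata
sysuse auto, clear
* the error is swallowed and execution continues; the return code is kept in _rc
capture confirm variable no_such_var
display _rc
* capture noisily shows the output and still traps any error
capture noisily summarize price
display _rc
```

A nonzero _rc after the first capture signals that the command failed; after the second it is 0 because summarize succeeded.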

cd directory
Change the current directory.
Note that you can use both slash (/) and backslash (\) under Windows, but only slash (/) under Unix.
Put quotes around the full directory name if it contains spaces.
On-line, context

clear
Clears Stata memory. USE WITH CAUTION!
On-line, context

cmdlog using filename
cmdlog close
Logs all the commands issued by the user
On-line, context

compare varname1 varname2 [if exp] [in range]
Compares the values of two variables
On-line, context

compress [varlist]
Reduces the amount of memory needed for the data by bringing it to the smallest storage type needed.
On-line, context

correlate [varlist] [if exp] [in range]
Computes the correlation between two or more variables based on the subset of the observations that are available for all variables.
On-line, context

count [if exp] [in range]
Shows the number of observations satisfying the if/in conditions
On-line, context

describe [varlist], [short]
Shows size of the data set, the number of observations, the variables in the data set, their types and labels
On-line, context

display exp
Evaluates an expression and outputs the result
On-line, context: 1

do filename [arguments] , [nostop]
Executes the specified do-file.
nostop allows execution to continue even if an error occurs.
On-line, context: 1, 2

drop varlist
Deletes specified variables from the current data set in memory.
On-line, context

drop [if exp] [in range]
Deletes specified observations from the current data set in memory.
On-line, context

dwstat
Performs the Durbin-Watson test of residual autocorrelation following regress.
The data must be tsset.
On-line, context

egen [type] newvar = fcn(arguments) [if exp] [in range], [options]
Extensions to generate.
On-line, context
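
The entry above is terse, so here is a small sketch of one common use of egen: a group mean repeated on every observation. It assumes the auto data set shipped with Stata; the new variable names mean_price and price_dev are illustrative.

```stata
sysuse auto, clear
* mean of price within each level of foreign, stored on every observation
egen mean_price = mean(price), by(foreign)
* deviation of each car's price from the mean of its group
generate price_dev = price - mean_price
summarize price_dev
```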

ereturn list
Shows the stored results of the previous estimation command
On-line, context

exit, [clear]
Exit Stata
The clear option confirms that it is OK to lose unsaved data.
On-line, context

findit text
Searches for text in the help files available on the current machine, and over the Internet.
On-line, context

foreach lclname of [ varlist varlist | numlist numlist | local lclname2 | global gblname ] {
foreach lclname in arbitrary_list {
Performs the actions in the body of the loop once for each element of the specified list or macro; the current element is referred to as `lclname'.
On-line, context
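
A minimal sketch of both forms of foreach, meant for a do-file and assuming the auto data set shipped with Stata. The words alpha, beta and gamma are arbitrary placeholders.

```stata
sysuse auto, clear
* "of varlist": `v' holds the name of the current variable
foreach v of varlist price mpg weight {
    quietly summarize `v'
    display "`v': mean = " r(mean)
}
* "in": loop over an arbitrary list of words
foreach w in alpha beta gamma {
    display "`w'"
}
```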

forvalues lclname = range {
Performs the actions in the body of the loop once for each value of `lclname' in the specified range.
If an arbitrary numlist is to be used, see foreach.
On-line, context
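
As a sketch of the range syntax, the loop below adds up the integers from 1 to 10. The macro names i and total are illustrative.

```stata
* accumulate 1 + 2 + ... + 10 in a local macro
local total = 0
forvalues i = 1/10 {
    local total = `total' + `i'
}
* displays 55
display `total'
* ranges with a step are written first(step)last
forvalues j = 0(5)20 {
    display `j'
}
```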

generate [type] newvarname=exp [if exp] [in range]
Creates a new variable and sets it equal to exp for the observations satisfying the if/in conditions, and to missing otherwise
On-line, context
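
The role of the if qualifier is easiest to see in a short sketch. It assumes the auto data set shipped with Stata; the variable name cheap and the 5000 cutoff are illustrative.

```stata
sysuse auto, clear
* cheap is 1 where the condition holds and missing everywhere else
generate byte cheap = 1 if price < 5000
count if missing(cheap)
* fill in the remaining observations with replace
replace cheap = 0 if missing(cheap)
tabulate cheap
```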

global gblname string
Creates a global macro gblname and copies string to it.
global gblname = exp
Creates a global macro gblname, evaluates exp and copies the result to gblname.
The global macros are referred to as $gblname.
Ambiguities in global macro names are resolved by putting { } where needed, as in ${gblname}.
On-line, context
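
A minimal sketch of global macros and of the braces that mark where a macro name ends. The names outdir and pre and their contents are hypothetical.

```stata
* store a string in a global macro and use it inside another string
global outdir "results"
display "$outdir/table1.txt"
* without braces Stata would look for a macro called predef
global pre "abc"
display "${pre}def"
```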

graph box variable [if exp] [in range], [by(varlist) ]
Draws a box-and-whisker plot of the data
On-line, context

gsort [+|-]variable [[+|-]variable ...]
Generalized sorting in both ascending and descending order
On-line, context
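
A short sketch of mixed ascending and descending sorts, assuming the auto data set shipped with Stata.

```stata
sysuse auto, clear
* most expensive cars first
gsort -price
list make price in 1/5
* ascending in foreign, descending in mpg within each group
gsort foreign -mpg
list make foreign mpg in 1/5
```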

help command
Displays help on the specified command.
On-line, context

hettest [varlist]
Lagrangian multiplier test for heteroskedasticity; only available after regress
On-line, context

histogram variable [if exp] [in range] , [discrete width(#) bin(#) start(#) density fraction frequency ...]
Draws a histogram for a single variable. Look through the help file for relevant options.
On-line, context

imtest
Information matrix test on the residual distribution
On-line, context

infile variables using filename, [clear]
Reads data from the raw text file. You can specify a dictionary for complex tasks.
On-line, context

inspect variable [if exp] [in range]
Gives a small histogram, the number of values that are: unique; positive, zero, negative; integer and non-integer; missing.
On-line, context

keep varlist
Keeps specified variables and deletes others from the current data set in memory.
On-line, context

keep [if exp] [in range]
Keeps specified observations and deletes others from the current data set in memory.
On-line, context

label variable varname ["text"]
Gives a variable a label that is shown in the Variables window, in the output of describe, tabulate, and in graphs
On-line, context

label define labelname # "text" ...
label values varname [labelname]
Defines a set of labelled values, and applies this set to the specified variable
On-line, context
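
The two-step logic of value labels (define a set, then attach it) can be sketched as follows. The example assumes the auto data set shipped with Stata; the names heavy and heavylbl and the 3000 cutoff are illustrative.

```stata
sysuse auto, clear
generate byte heavy = weight > 3000
* variable label: shown by describe and in graphs
label variable heavy "Weight above 3,000 lbs."
* value labels: first define the set, then attach it to the variable
label define heavylbl 0 "light" 1 "heavy"
label values heavy heavylbl
tabulate heavy
```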

list [variables] [if exp] [in range]
Shows the entries of the data set for the specified variables (for all variables by default) and specified observations (all observations by default).
On-line, context

local lclname string
Creates a local macro lclname and copies string to it.
local lclname = exp
Creates a local macro lclname, evaluates exp and copies the result to lclname.
The local macros are referred to as `lclname'.
On-line, context
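
The difference between the two forms (with and without the equal sign) is shown in this minimal sketch, meant for a do-file. The macro names greeting, four and expr are illustrative.

```stata
* without "=" the text is copied as is
local greeting hello world
local expr 2 + 2
* with "=" the expression is evaluated first
local four = 2 + 2
display "`greeting'"
* shows the stored text: 2 + 2
display "`expr'"
* shows 4 in both cases: display evaluates the substituted text
display `expr'
display `four'
```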

log using filename
log close
Logs Stata output
On-line, context

ls [filename(s)]
Lists the files in the current directory
On-line, context

lv variable [if exp] [in range]
Letter-value display of a variable: the median, fourths (quartiles), eighths, and so on into the tails. Helps to assess the symmetry and normality of the distribution.
On-line, context

memory
set memory #m
The former displays the available memory, and the latter changes the amount of memory Stata can use.
On-line, context: 1, 2

set more [on|off]
When on, Stata pauses each time the screen fills with output and waits for the user to press a key; off disables the pauses.
On-line, context

move varname1 varname2
Moves varname1 to the position of varname2, and shifts the remaining variables, including varname2, to make room.
On-line, context

mvdecode varlist [if exp] [in range] , mv(numlist ...)
Changes occurrences of numlist to a missing value code. mv() is required.
On-line, context

mvencode varlist [if exp] [in range] , mv(#...) [override]
Changes missing values to the specified number(s). mv() is required.
Without override, mvencode refuses to make any changes if the numeric values already exist in the varlist.
On-line, context
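
mvencode and mvdecode are inverses of each other, which the following sketch illustrates. It assumes the auto data set shipped with Stata, where rep78 has a few missing values; the code -9 is an arbitrary choice that does not occur in rep78.

```stata
sysuse auto, clear
count if missing(rep78)
* replace missing values of rep78 by the numeric code -9
mvencode rep78, mv(-9)
count if rep78 == -9
* turn the code -9 back into a missing value
mvdecode rep78, mv(-9)
count if missing(rep78)
```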

noisily Stata command
Turns the output of the command back on, cancelling the effect of quietly or capture.
On-line, context

notes [varname]: text
Adds notes to the whole data set or to a particular variable
On-line, context

order varlist
Moves the specified variables to the front of the data set.
On-line, context

outfile [variables] using filename, [replace]
Writes the specified variables (or all of the data) to a raw text file.
On-line, context

ovtest , [rhs]
Test for omitted nonlinearity
On-line, context

predict newvarname [if exp] [in range] , options
A universal post-estimation command to obtain observation level results of an estimation procedure, such as fitted values and residuals for regress, or predicted probabilities for probit. The supported options / statistics are provided in help files for the original estimation commands.
On-line, context

preserve
Temporarily saves your current data set in memory, to be restored later.
On-line, context

probit depvar [varlist] [if exp] [in range], [robust cluster(varname) ...]
Estimates the probit regression of depvar on varlist.
robust and cluster options provide corrections to the covariance matrix of the estimates.
predict options: p for the probability of a positive outcome (default); xb for fitted values; stdp for the standard error of the prediction.
On-line, context

pwcorr [varlist] [if exp] [in range] , [sig obs ...]
Computes pairwise correlations.
sig option reports the significance level of the test that each correlation is zero.
obs option reports the number of observations on which each correlation is based.
On-line, context

pwd
Displays the current directory.
On-line, context

quietly Stata command
Suppresses Stata output from the command, except for error messages.
Unlike capture, stops for errors.
On-line, context

range varname #first #last [#obs]
Generates a numerical range / grid of points
On-line, context

recode varlist (rule) ..., [generate(newvarlist)]
Changes the values of numeric variables according to the specified rules.
rule is of the form
numlist | nonmissing | missing = #
On-line, context
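
A sketch of several rules in one recode call, assuming the auto data set shipped with Stata. The grouping of rep78 into three categories and the name rep3 are illustrative.

```stata
sysuse auto, clear
* collapse the five repair-record categories into three;
* generate() leaves rep78 untouched and puts the result in rep3
recode rep78 (1 2 = 1) (3 = 2) (4 5 = 3), generate(rep3)
* missing values of rep78 stay missing in rep3
tabulate rep78 rep3, missing
```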

regress depvar [varlist] [if exp] [in range], [robust cluster(varname) ...]
Estimates the linear regression of depvar on varlist.
robust and cluster options provide corrections to the covariance matrix of the estimates.
predict options: residuals for the residuals; xb for fitted values (default); stdp for the standard error of the prediction; etc.
On-line, context
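
The predict options listed above can be tried out with a sketch like the following. It assumes the auto data set shipped with Stata; the model itself and the names price_hat and price_res are illustrative.

```stata
sysuse auto, clear
regress price mpg weight
* fitted values (xb is the default after regress)
predict price_hat
* residuals
predict price_res, residuals
* joint test that both slope coefficients are zero
test mpg weight
summarize price_hat price_res
```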

replace varname =exp [if exp] [in range]
Changes the entries of an existing variable
On-line, context

restore, [not preserve]
Restores the data that was previously preserved.
not cancels the previous preserve without restoring the data.
preserve restores the data now and keeps the preserved copy for further restoration.
On-line, context
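
A minimal sketch of a preserve / restore pair around a destructive operation, assuming the auto data set shipped with Stata.

```stata
sysuse auto, clear
preserve
* work on a subset: only the foreign cars remain in memory
keep if foreign == 1
count
* bring back the full data set
restore
count
```

The first count reports the 22 foreign cars, the second the full 74 observations.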

return list
Shows the results of a non-estimation command, if applicable
On-line, context

run filename [arguments] , [nostop]
Executes the specified do-file suppressing the output, except for errors.
nostop allows execution to continue even if an error occurs.
On-line, context

sample # [if exp] [in range], [count]
Takes a subsample of the data, either # per cent of the original data, or # observations if count option is specified.
On-line, context

save filename, [replace]
Saves the current data set in Stata format
On-line, context

scatter y-variable(s) x-variable [if exp] [in range], [formatting options]
Scatter plot of one variable against another, or several y-variables against one x-variable.
The list of the formatting options is huge and will include options on the marker shape, size and color, axes, space of the graph, title, and many other things.
Study the online help for details.
On-line, context

search phrase
Searches Stata help and online resources for phrase.
On-line, context

sort varlist
Sorts the observations in ascending order of variables in varlist
On-line, context

summarize [varlist] [if exp] [in range], [detail]
Descriptive statistics: number of observations, means, standard deviations, minima and maxima.
detail option adds percentiles (including the median), variance, skewness and kurtosis.
On-line, context
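
summarize leaves its results behind in r(), where return list can show them. A brief sketch, assuming the auto data set shipped with Stata:

```stata
sysuse auto, clear
summarize price, detail
* list everything summarize left behind in r()
return list
* individual results can be used in expressions
display r(mean)
display r(p50)
```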

sysuse filename
Loads a data set that comes with Stata
On-line, context

table var1 [var2] [if exp] [in range], [contents(...)] ...
Makes a table of summary statistics defined by contents, including frequencies, means, counts, etc. of other variables by var1 or by var1 and var2.
On-line, context

tabulate var1 [var2] [if exp] [in range], [row column nolabel ...]
One- and two-way frequency tables, with optional row and column summaries/frequencies, and independence tests.
On-line, context

tsset [panelvar] timevar
Declares the data to be of the time series format with timevar representing time variable.
For panel data, panelvar identifies panels (individuals, enterprises, countries, ...), and timevar indicates time within those panels.
The data set will be sorted by timevar or by panelvar timevar

twoway plot [if exp] [in range], twoway options
Two way graphs, including scatter plots, line plots, bar plots, histograms, and smoothing / trend line plots.
May be used as a wrapper to combine several scatter plots.
Options control axes, labels, title, legends, etc.
On-line, context

use filename
Loads the Stata data file into Stata memory
On-line, context

vce
Displays the variance-covariance matrix of the estimated coefficients
On-line, context

vif
Variance inflation factors
On-line, context

whelp command
Displays help on the specified command.
On-line, context