All data projects can benefit from building a Data Management Plan (“DMP”) before the project begins.  Typically a DMP is a formal document that describes your data and what your team will do with it during and after the data project.

There is no cookie-cutter DMP that is right for every project, but in most cases the following questions should be addressed in your DMP:

  1. What kind of data will your project analyze?  What file formats and software packages will you use?  What will your data output be?  How will you collect and process the data?
  2. How will you document and organize your data?  What metadata will you collect?  What standards and formats will you use?
  3. What are your plans for data access within your team?  What are the roles that the individuals in your team will play in the data analysis process?  How will you address any privacy or ethical issues, if applicable?
  4. What are your plans for long term archiving?  What file formats will you archive the data in?  Who will be responsible for the data after the project is complete?  Where will you save the files?
  5. What outside resources do you need for your project?  How much time will the project take your team to complete and audit?  How much will it cost?

When working on any type of data project, planning ahead is a crucial step.  Before starting in on a project, it’s important to think through as many of the details as possible so you can budget enough time and resources to accomplish all of the objectives.  As a matter of fact, some organizations and government entities require a Data Management Plan (“DMP”) to be in place in all of their projects.


A DMP is a formal document that describes the data and what your team will do with it during and after the data project.  Many organizations and agencies require one, and each entity has specific requirements for their DMPs.


DMPs can be created in just a simple readme.txt file, or can be as detailed as DMPs tailored to specific disciplines using online templates such as  The DMPTool is designed to help create ready-to-use data management plans.

Who is Doug Berg, Ph.D.?

Posted by Matt Rigling and Susie Wirtanen | Big Data, Economics, Statistical Analysis

Doug Berg, Ph.D., is an expert in big data, and has been working with EmployStats and Principal Economist Dr. Dwight Steward for several years regarding class action and discrimination lawsuits.  Dr. Berg is currently a professor at Sam Houston State University in the Department of Economics.  He received his Bachelor’s degree in Accounting from the University of Minnesota, and his Ph.D. in Economics from Texas A&M University.  Dr. Berg will provide additional support and his expert insight into using big data in employment litigation.  Doug Berg, Ph.D., describes litigation as “living on data”, and the better the data, the better the argument.  EmployStats welcomes his insight into the underlying meaning behind the data our clients provide us!

Using Big Data Analytics in Litigation

Posted by Matt Rigling and Susie Wirtanen | Big Data, Statistical Analysis

Due to the massive computational requirements of analyzing big data, trying to find the best approach to big data projects can be a daunting task for most individuals.  At EmployStats, our team of experts utilize top of the line data systems and software to seamlessly analyze big data and provide our clients with high quality analysis as efficiently as possible.

  1. The general approach for big data analytics begins with fully understanding the data provided as a whole.  Not only must the variable fields in the data be identified, but one must also understand what these variables represent and determine what values are reasonable for each variable in the data set.  
  2. Next, the data must be cleaned and reorganized into the clearest format, ensuring that data values are not missing and are within reasonable ranges of certainty.  As the size of the data increases, the amount of work necessary to clean the data increases.  In larger datasets there are more individual components which are typically dependent on each other, therefore it is necessary to write computer programs to evaluate the accuracy of the data.
  3. Once the entire dataset has been cleaned and properly formatted, one needs to define the question that will be answered with the data.  One must look at the data and see how it relates to the question.  The questions for big data projects may be related to frequencies, probabilities, economic models, or any number of statistical properties.  Whatever it is, one must then process the data in the context of the question at hand.
  4. Once the answer has been obtained, one must determine that the answer is a strong answer.  A delicate answer, or one that would significantly change if the technique of the analysis was altered, is not ideal.  The goal of big data analytics is to have a robust answer, and one must try to attack the same question in a number of different ways in order to build confidence in the answer.

What is Big Data in Litigation?

Posted by Matt Rigling and Susie Wirtanen | Big Data, Statistical Analysis

Big data is not simply a size, it is a way of describing the type of data tools that will be utilized for an analysis.  Most, if not all, of the big data we work with at EmployStats requires specific data tools that are ever changing and evolving, as well as new tools that are being introduced into the market constantly.  

Each avenue will handle big data differently, and offer specific benefits that will determine how an analysis will be performed, as well as how results will be interpreted.  EmployStats constantly keeps up to date with the latest and greatest data analytic software for large data sets in order to optimize the outcome of these types of analyses.  

Many recent cases such as United States of America v. Abbott Laboratories and Pompliano v. Snapchat have utilized big data analysis techniques in litigation, proving that not only is it common to use big data in litigation, it is necessary to bring many cases to a successful close.