With businesses and government now firmly reliant on electronic data for their regular operations, litigants are increasingly presenting data-driven analyses to support their assertions of fact in court. This application of Data Analytics, the practice of drawing insights from large data sources, is helping courts answer a variety of questions. For example, can a party establish a pattern of wrongdoing based on past transactions? Such evidence is particularly important in litigation involving large volumes of data: business disputes, class actions, fraud, and whistleblower cases. The use cases for data-based evidence increasingly cut across industries, whether in financial services, education, healthcare, or manufacturing.


Given the increasing importance of Big Data and Data Analytics, parties with a greater understanding of data-based evidence have an advantage. Statistical analyses of data can provide judges and juries with information that otherwise would not be known. Electronic data hosted by a party is discoverable, data is impartial (in the abstract), and large data sets can be readily analyzed with increasingly sophisticated techniques. Data-based evidence, effectively paired with witness testimony, strengthens a party’s assertion of the facts. Realizing this, litigants engage expert witnesses to provide dueling tabulations or interpretations of data at trial. US case law on data-based evidence is still evolving, and judges and juries are making important decisions based on the validity and correctness of complex and at times contradictory analyses.


This series will discuss best practices in applying analytical techniques to complex legal cases, while focusing on important questions that must be answered along the way. Every step, from acquiring data, to preparing an analysis, to running statistical tests, to presenting results, carries significant consequences for the applicability of data-based evidence. In cases where both parties employ expert witnesses to analyze thousands if not millions of records, a party’s assertions of fact are easily undermined if their analysis is deemed less relevant or inappropriate. Outcomes may turn on the statistical significance of a result, the relevance of a prior analysis to a certain class, the importance of excluded data, or the rigor of an anomaly detection algorithm. At worst, expert testimony can be dismissed entirely.


Many errors in data-based evidence stem, at their heart, from faulty assumptions about what the data can prove. Lawyers and clients may overestimate the relevance of their supporting analysis, or mold data (and assumptions) to fit certain facts. Litigating parties and witnesses must constantly ensure data-driven evidence is grounded in best practices while addressing the matter at hand. Data analytics is a powerful tool, but it is only as good as the user.

All data projects can benefit from building a Data Management Plan (“DMP”) before the project begins. Typically, a DMP is a formal document that describes your data and what your team will do with it during and after the data project.

There is no cookie-cutter DMP that is right for every project, but in most cases the following questions should be addressed in your DMP:

  1. What kind of data will your project analyze?  What file formats and software packages will you use?  What will your data output be?  How will you collect and process the data?
  2. How will you document and organize your data?  What metadata will you collect?  What standards and formats will you use?
  3. What are your plans for data access within your team?  What are the roles that the individuals in your team will play in the data analysis process?  How will you address any privacy or ethical issues, if applicable?
  4. What are your plans for long term archiving?  What file formats will you archive the data in?  Who will be responsible for the data after the project is complete?  Where will you save the files?
  5. What outside resources do you need for your project?  How much time will the project take your team to complete and audit?  How much will it cost?

When working on any type of data project, planning ahead is a crucial step. Before starting a project, it’s important to think through as many of the details as possible so you can budget enough time and resources to accomplish all of the objectives. In fact, some organizations and government entities require a Data Management Plan (“DMP”) to be in place for all of their projects.


A DMP is a formal document that describes the data and what your team will do with it during and after the data project. Many organizations and agencies require one, and each entity has its own specific requirements for DMPs.


A DMP can be as simple as a readme.txt file, or as detailed as a discipline-specific plan built from an online template such as those at DMPTool.org. The DMPTool is designed to help create ready-to-use data management plans.
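As an illustration, even a bare-bones readme.txt DMP can answer the five questions above. The project, file names, timelines, and roles below are hypothetical, invented only to show the format:

```
DATA MANAGEMENT PLAN -- Wage & Hour Analysis (hypothetical example)

1. Data: Payroll and timekeeping extracts produced in discovery
   (CSV and Excel); analyzed in Stata; output as tables and charts.
2. Documentation: A data dictionary (variables, units, sources) kept
   in data_dictionary.xlsx; files named by source and production date.
3. Access: Read-only master copies; analysts work from copies; any
   personal information restricted to authorized team members.
4. Archiving: Final datasets saved as CSV alongside the project
   report; the project lead is responsible for the archive after close.
5. Resources: Estimated six weeks for analysis and audit; budget
   includes software licenses and secure storage.
```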

Dwight Steward, Principal Economist at EmployStats, will be a featured speaker at the upcoming employment law CLE in San Francisco on July 12, 2017. The CLE will take place at the Bently Reserve in downtown San Francisco, CA, and will cover the recent California Equal Pay Act.


Dwight Steward, Ph.D., is the author of the book Statistical Analysis of Employment Data in Discrimination Lawsuits and EEO Audits. The statistical guide shows attorneys and human resource professionals how to provide managers and courts with empirical evidence that goes beyond anecdotes and stories.


The textbook presents the methodologies used in statistical employment data analyses. It takes a non-mathematical approach to developing the conceptual framework underlying employment data analyses, so that professionals with no background in statistics can easily use the book as a tool in their practice.


Visit www.CaliforniaEqualPay2017CLE.com to register to hear directly from Dwight Steward at the July 12th employment law CLE in San Francisco, CA.


Interested in purchasing Dwight Steward’s statistical guide? Find it on Amazon at www.amazon.com/Statistical-Analysis-Employment-Discrimination-Lawsuits/dp/0615340504

EmployStats is pleased to announce that employee Matt Rigling has been offered admission into UT Austin’s Master’s program and will begin taking classes in Summer 2017! Matt will complete the 18-month program to obtain his Master’s degree in Economics. He looks forward to taking advanced analytical and econometric courses and bringing those skills to his work here at EmployStats. Matt will continue working full-time in his position of Research Associate at EmployStats while pursuing his advanced degree.

Matt Rigling began working for EmployStats as an intern in March 2015, and was promoted to Research Associate in May 2015 after graduating from the University of Texas at Austin with a Bachelor’s degree in Economics. When he’s not crunching numbers for EmployStats, Matt enjoys watching the San Antonio Spurs and going on hikes with his puppy Zella.

Doug Berg, Ph.D., is an expert in big data, and has been working with EmployStats and Principal Economist Dr. Dwight Steward for several years on class action and discrimination lawsuits. Dr. Berg is currently a professor at Sam Houston State University in the Department of Economics. He received his Bachelor’s degree in Accounting from the University of Minnesota, and his Ph.D. in Economics from Texas A&M University. Dr. Berg will provide additional support and expert insight into using big data in employment litigation. He describes litigation as “living on data”: the better the data, the better the argument. EmployStats welcomes his insight into the underlying meaning behind the data our clients provide us!

Due to the massive computational requirements of analyzing big data, finding the best approach to big data projects can be a daunting task. At EmployStats, our team of experts uses top-of-the-line data systems and software to seamlessly analyze big data and provide our clients with high-quality analysis as efficiently as possible.

  1. The general approach to big data analytics begins with fully understanding the data as a whole. Not only must the variable fields in the data be identified, but one must also understand what these variables represent and determine what values are reasonable for each variable in the data set.
  2. Next, the data must be cleaned and reorganized into the clearest format, ensuring that data values are not missing and fall within reasonable ranges. As the size of the data increases, so does the amount of work needed to clean it. Larger datasets have more individual components, which are typically dependent on each other; it is therefore necessary to write computer programs to evaluate the accuracy of the data.
  3. Once the entire dataset has been cleaned and properly formatted, one needs to define the question that will be answered with the data and consider how the data relates to it. The questions for big data projects may concern frequencies, probabilities, economic models, or any number of statistical properties. Whatever the question, one must then process the data in its context.
  4. Once an answer has been obtained, one must determine whether it is a strong answer. A fragile answer, one that would change significantly if the analysis technique were altered, is not ideal. The goal of big data analytics is a robust answer, so one should attack the same question in a number of different ways to build confidence in the result.
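The four steps above can be sketched in code. This is a minimal illustration only: the toy payroll records, field names, and plausible ranges are hypothetical, not drawn from any actual case data.

```python
# Step 1: understand the data -- define each field and its plausible range.
PLAUSIBLE_RANGES = {
    "hours_worked": (0, 100),      # weekly hours
    "hourly_rate": (7.25, 500.0),  # dollars per hour
}

# Tiny hypothetical payroll extract.
records = [
    {"employee_id": 1, "hours_worked": 45, "hourly_rate": 20.0},
    {"employee_id": 2, "hours_worked": 38, "hourly_rate": 22.5},
    {"employee_id": 3, "hours_worked": None, "hourly_rate": 19.0},  # missing value
    {"employee_id": 4, "hours_worked": 250, "hourly_rate": 21.0},   # out of range
    {"employee_id": 5, "hours_worked": 42, "hourly_rate": 20.5},
]

# Step 2: clean -- flag records with missing or out-of-range values.
def is_clean(rec):
    for field, (lo, hi) in PLAUSIBLE_RANGES.items():
        value = rec.get(field)
        if value is None or not (lo <= value <= hi):
            return False
    return True

clean = [r for r in records if is_clean(r)]
print(f"kept {len(clean)} of {len(records)} records")  # kept 3 of 5 records

# Step 3: define and answer the question -- e.g., what share of
# employees worked overtime (more than 40 hours)?
overtime_share = sum(r["hours_worked"] > 40 for r in clean) / len(clean)

# Step 4: probe robustness -- would the answer change much if the
# flagged (but non-missing) records were retained?
nonmissing = [r for r in records if r["hours_worked"] is not None]
alt_share = sum(r["hours_worked"] > 40 for r in nonmissing) / len(nonmissing)
print(round(overtime_share, 2), round(alt_share, 2))
```

In a real engagement the cleaning rules and robustness checks would be far more extensive, but the structure, documenting plausible values up front and re-running the question under alternative treatments of the data, is the same.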

Big data is not simply a size; it is a way of describing the type of data tools an analysis will require. Most, if not all, of the big data we work with at EmployStats requires specialized tools that are constantly changing and evolving, with new tools being introduced to the market all the time.

Each tool handles big data differently and offers specific benefits that shape how an analysis is performed and how results are interpreted. EmployStats constantly keeps up to date with the latest data analytic software for large data sets in order to optimize the outcome of these types of analyses.

Many recent cases, such as United States of America v. Abbott Laboratories and Pompliano v. Snapchat, have utilized big data analysis techniques in litigation, showing that big data is not only common in litigation but often necessary to bring cases to a successful close.

We are joining forces with David Neumark, Ph.D., an expert on labor market discrimination in California, bringing new expertise to the EmployStats team. Dr. Neumark is the Chancellor’s Professor of Economics at U.C. Irvine, and previously taught at Michigan State after starting his career at the Federal Reserve. His primary work has focused on age and race discrimination, developing new theories as well as offering expert consulting in discrimination cases. Our highly skilled researchers will provide support for Dr. Neumark in many of his large, complex employment litigation cases. We are excited to have him on board!


EmployStats Research Associate Susan Wirtanen recently visited New York, NY to attend a course in Stata. Stata is a statistical software package used by EmployStats analysts in almost all of our case work, especially wage and hour and employment litigation. The training covered data management, data manipulation, and tools for complex analyses. These skills will allow Susan to work quickly and efficiently through the large data sets our clients provide for analysis.

Susan Wirtanen was hired at EmployStats in June 2016 as an intern after graduating from the University of Texas at Austin with a Bachelor’s degree in Economics. Susan began working full-time as a Research Associate at the beginning of 2017. In addition to being a full-time employee, Susan coaches club volleyball here in Austin, and recently finished her first season of coaching.