Economic Thinking In Action

Recent Posts

  • EmployStats Welcomes Christian Adams
  • EmployStats Welcomes Ruth Robinson
  • EmployStats Assist with the Rescission of the Texas Medicaid Waiver Program
  • Calculating Damages and Customizing Labor Market Data
  • Gathering Data for Labor Market and Mitigation Studies

    Archives

    • September 2022
    • August 2022
    • June 2022
    • May 2022
    • April 2022
    • March 2022
    • February 2022
    • January 2022
    • December 2021
    • November 2021
    • August 2021
    • May 2021
    • March 2021
    • February 2021
    • January 2021
    • December 2020
    • November 2020
    • October 2020
    • September 2020
    • August 2020
    • July 2020
    • June 2020
    • May 2020
    • April 2020
    • January 2020
    • October 2019
    • September 2019
    • August 2019
    • May 2019
    • April 2019
    • March 2019
    • February 2019
    • September 2017
    • August 2017
    • July 2017
    • June 2017
    • May 2017
    • April 2017
    • May 2016
    • April 2016
    • March 2016
    • January 2016
    • December 2015
    • November 2015
    • October 2015
    • September 2015
    • August 2015
    • July 2015
    • June 2015
    • May 2015
    • April 2015
    • March 2015
    • February 2015
    • January 2015
    • December 2014
    • November 2014
    • October 2014
    • September 2014
    • August 2014
    • July 2014
    • June 2014
    • May 2014
    • April 2014
    • March 2014
    • February 2014
    • January 2014
    • December 2013
    • November 2013
    • October 2013
    • November 2011
    • October 2011
    • August 2011

    Categories

    • 401k Industry
    • Academic
    • Big Data
    • BLS Data
    • Business Damages
    • Credit Markets
    • Current Events and News
    • Data Analytics
    • Earnings
    • Economics
    • Employment
    • Industry
    • Industry trends
    • Interest Rates
    • Job openings
    • Labor data
    • Pay day lending
    • Personal Injury
    • Statistical Analysis
    • Texas Economy
    • U.S. Economy
    • Uncategorized
    • Wage and hour cases
    • Whistleblower

      Tag: e-discovery

      Data Analytics and the Law: Unstructured Data

      Data analytics is only beginning to tap into the unstructured data that makes up the bulk of everyday life. Text messages, emails, maps, audio files, PDF files, pictures, and blog posts are all ‘unstructured data,’ as opposed to the structured data sources mentioned thus far. By some estimates, up to 80% of all enterprise data is unstructured. So how can a client’s text messages or recorded phone calls be analyzed like a SQL table? Unstructured data does not fit easily into predefined models or schemas, although some CRM tools (e.g., Salesforce) do store text-based fields. Typically, documents do not lend themselves to traditional database queries. This does not mean that ‘structured’ and ‘unstructured’ data are in conflict with each other.

      Document-based evidence is, of course, an integral part of the legal system. Lawyers and law offices now have access to comprehensive e-discovery programs, which sift through millions of documents based on keywords and terms. Selecting relevant information to prove a case is nothing new. The intersection with data analytics arises when hundreds of thousands or millions of text-based records are analyzed as a whole to prove an assertion in court.

      Turning unstructured text into analyzable, structured data is made possible by increasingly sophisticated methods. Some machine learning algorithms, for example, analyze pictures and pick up on repeating patterns. Text mining programs scrape PDFs, websites, and social media for content, and then load the text into preassigned columns and variables. Analyses can then be run on, for example, the positivity or negativity of a sentence, the frequency of certain words, or the correlation of certain phrases with one another. Natural language processing (NLP) includes speech recognition, which has itself seen significant progress in the past two decades. As a result, analytics on unstructured data is now more useful in producing relevant evidence.
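
As a rough illustration of that idea, the sketch below turns a few free-text messages into structured rows and then computes a word frequency count, a crude sentiment score, and a simple co-occurrence count. The messages, the word lists, and the names `to_row`, `POSITIVE`, and `NEGATIVE` are all hypothetical and invented for this example; a real analysis would likely rely on a vetted sentiment lexicon or a trained model rather than a handful of hand-picked words.

```python
import re
from collections import Counter

# Hypothetical messages, invented for illustration only.
MESSAGES = [
    "Great work on the report, thanks for the quick turnaround.",
    "Please delete the old report before the audit.",
    "The audit went badly; management is angry about the report.",
]

# Tiny illustrative word lists (an assumption, not a real sentiment lexicon).
POSITIVE = {"great", "thanks", "good", "quick"}
NEGATIVE = {"badly", "angry", "delete", "bad"}


def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def to_row(message_id, text):
    """Turn one free-text message into a structured row (a dict of columns)."""
    tokens = tokenize(text)
    positive_hits = sum(token in POSITIVE for token in tokens)
    negative_hits = sum(token in NEGATIVE for token in tokens)
    return {
        "message_id": message_id,
        "word_count": len(tokens),
        "sentiment_score": positive_hits - negative_hits,
        "mentions_audit": "audit" in tokens,
        "mentions_report": "report" in tokens,
    }


# One structured row per message: the "preassigned columns and variables".
rows = [to_row(i, text) for i, text in enumerate(MESSAGES)]

# Word frequency across the whole collection.
word_counts = Counter(token for text in MESSAGES for token in tokenize(text))

# Co-occurrence: number of messages that mention both "audit" and "report".
both = sum(row["mentions_audit"] and row["mentions_report"] for row in rows)

for row in rows:
    print(row)
print(word_counts.most_common(3))
print("messages mentioning both terms:", both)
```

Once the text has been reduced to rows like these, it can be loaded into a spreadsheet or database table and queried like any other structured data set.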

      Just as important as the unstructured data itself is its metadata: data that describes data. A text message or email carries additional information about itself, such as the author, the recipient, the time, and the length of the message. These pieces of information can be stored in a structured data set, without any reference to the original content, and then analyzed. For example, a company may hold metadata on electronic documents at specific points in a transaction’s life cycle; running a pattern analysis on this metadata could identify whether certain documents were created, altered, or destroyed after an event.
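
A minimal sketch of such a metadata check follows. It assumes each document's metadata has already been extracted into a record with creation, modification, and deletion timestamps; the `DocumentMetadata` class, the record values, and the event date are hypothetical and exist only to show the shape of the analysis. Note that the document content itself is never read.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class DocumentMetadata:
    """Hypothetical metadata record; field names are assumptions."""
    doc_id: str
    author: str
    created: datetime
    last_modified: datetime
    deleted: Optional[datetime] = None  # None means the document still exists


# Hypothetical event of interest (for example, the date a dispute began).
EVENT = datetime(2019, 1, 15, 9, 0)

# Invented records, for illustration only.
RECORDS = [
    DocumentMetadata("contract-001", "alice",
                     datetime(2018, 12, 1, 10, 0), datetime(2018, 12, 3, 16, 30)),
    DocumentMetadata("invoice-017", "bob",
                     datetime(2018, 12, 20, 11, 0), datetime(2019, 1, 16, 8, 45)),
    DocumentMetadata("memo-042", "carol",
                     datetime(2019, 1, 10, 14, 0), datetime(2019, 1, 10, 14, 5),
                     deleted=datetime(2019, 1, 15, 17, 20)),
    DocumentMetadata("summary-003", "bob",
                     datetime(2019, 1, 18, 9, 30), datetime(2019, 1, 18, 9, 30)),
]


def activity_after(record, event):
    """Return labels for any activity on the document recorded after the event."""
    labels = []
    if record.created > event:
        labels.append("created")
    elif record.last_modified > event:
        labels.append("altered")
    if record.deleted is not None and record.deleted > event:
        labels.append("destroyed")
    return labels


# Keep only the documents with some activity after the event.
flagged = {}
for record in RECORDS:
    labels = activity_after(record, EVENT)
    if labels:
        flagged[record.doc_id] = labels

for doc_id, labels in flagged.items():
    print(doc_id, labels)
```

A flag here is only a starting point for review: a document edited after the event may have been changed for entirely routine reasons.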

      In instances of high-profile fraud, such as the London Interbank Offered Rate (LIBOR) manipulation scandal, prolific emails and text messages between traders added a new dimension to regulators’ cases against major banks. Overwhelming and repeated textual evidence, which can be produced through analyses of unstructured data, is yet another tool for litigating parties to prove a pattern of misconduct.

      Posted on March 6, 2019
      Author: Carl McClain
      Categories: Big Data, Data Analytics, Whistleblower
      Tags: Data Analytics, e-discovery, fraud, Litigation, metadata, pattern analysis, Speech-to-Text, whistleblower