Big Data & Document Analytics
Big Data Analytics

Big data can exceed or strain the memory and disk capacity of traditional computing platforms, requiring more advanced techniques for storing, manipulating, visualizing, and summarizing such large datasets. Even when the data fit within a traditional platform's capacity, the processing time needed for analysis can limit both the scope of feasible analysis and the productivity of experts. A specialized big data platform, such as a Hadoop cluster, makes complex analysis of large datasets feasible, timely, and efficient.

We have experience deploying our own Hadoop clusters and other terabyte-scale data storage solutions, and working with third-party hosting companies to provide secure remote access. The programming languages we most commonly use in conjunction with the Hadoop platform are Python, R, SQL, and Pig. We use these languages to transform, extract, and sample low-level transactional data, producing datasets better suited to traditional analysis tools such as Excel or Stata. Our expertise and experience in deploying Hadoop clusters provide Brattle with a competitive advantage in extracting from raw data the economic insights critical to each case.
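For illustration, below is a minimal sketch of such an extraction step, written as a Hadoop Streaming job in Python. The record layout (tab-separated timestamp, symbol, price, and quantity fields) and all file names are hypothetical; the job sums traded notional value by symbol, collapsing a large volume of trade records into a summary table small enough for Excel or Stata.

    #!/usr/bin/env python3
    # mapper.py -- Streaming mapper: emit one (symbol, notional) pair per trade record.
    # Assumed layout (hypothetical): tab-separated timestamp, symbol, price, quantity.
    import sys

    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4:
            continue  # skip malformed records
        _, symbol, price, quantity = fields
        try:
            print(f"{symbol}\t{float(price) * int(quantity)}")
        except ValueError:
            continue  # skip records with non-numeric price or quantity

    #!/usr/bin/env python3
    # reducer.py -- Streaming reducer: input arrives sorted by key, so totals can be
    # accumulated over each run of identical symbols and flushed on key change.
    import sys

    current, total = None, 0.0
    for line in sys.stdin:
        symbol, value = line.rstrip("\n").split("\t", 1)
        if symbol != current:
            if current is not None:
                print(f"{current}\t{total}")
            current, total = symbol, 0.0
        total += float(value)
    if current is not None:
        print(f"{current}\t{total}")

With both scripts made executable, the job would be submitted through the standard Hadoop Streaming jar; the jar location and HDFS paths below are placeholders that vary by installation:

    hadoop jar /path/to/hadoop-streaming.jar \
        -files mapper.py,reducer.py \
        -mapper mapper.py -reducer reducer.py \
        -input /data/trades -output /data/notional_by_symbol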

Datasets Often Requiring “Big Data” Tools

  • Bid/offer and executed trade records for stock market, foreign exchange, commodity, and other financial markets
  • Payment card transaction records
  • Mortgage-backed security due diligence and performance records
  • Airline and air cargo industry records
  • Railroad freight system cargo manifests
  • Pharmaceutical and healthcare records
  • Auto parts inventory and sale records
  • Life insurance policy and claim records
  • Pension and annuities recordkeeping
  • Hedge fund, bond fund, and mutual fund portfolio and trading analysis
  • Geographic tracking of customers and product shipments
  • Retail point-of-sale scanner data