Big Data, Better Conferences: From Strata to Visualized, 6 Save-the-Dates

[Image via DataGotham]

The next few months are a Northeastern data-fiend’s dream. Nearly every week boasts at least one high-caliber conference, and they touch on everything from data journalism to Julia to in-memory data stores to cross-disciplinary analytics.  The following six caught my eye, but you can find more on the excellent conference discovery site lanyrd.com.

1) DataGotham

The pitch:

DataGotham is a celebration of New York City's data community that will bring together professionals from finance to fashion and from startups to the Fortune 500.

The speakers:

  • Michael P. Flowers, Director of the Mayor’s Financial Crime Task Force and the NYC Policy and Strategic Planning Analytics Team
  • Steven E. Koonin, Director of NYU's Center for Urban Science and Progress
  • Blake Shaw, Data Scientist at Foursquare
  • Adam Laiacano, Data Scientist at Tumblr
  • Alicia Rankin, Head of Research and Fan Insights for NFL
  • Jake Porway, Founder and Exec Director of DataKind
  • Matthew Israel, Director of the Art Genome Project at Art.sy

The sessions: Not yet available, but tutorials include “Data Journalism Fundamentals,” “MongoDB & R,” and “An Introduction to Julia”

The space: NYU’s Stern School of Business (Paulson Auditorium and classrooms)

The cost: $499, or $250 for academics and non-profits

The dates: Sept. 13-14

2) Strata Conference and Hadoop World

The pitch:

The O’Reilly Strata Conference explores the changes brought to technology and business by big data, data science, and pervasive computing. This year, Strata has joined forces with Hadoop World to create the largest gathering of the Apache Hadoop community in the world. Strata brings together decision makers using the raw power of big data to drive business strategy, and practitioners who collect, analyze, and manipulate that data—particularly in the worlds of finance, media, and government.

The speakers:

  • Mike Olson, Cloudera
  • Alistair Croll, Solve for Interesting
  • Edd Dumbill, O’Reilly Media
  • Jim Adler, Chief Privacy Officer, Intelius
  • Abhijit Bose, Director and Senior Data Scientist of Digital Analytics, American Express
  • Alice Brennan, Journalist, the New York World

The sessions: “Search and Real-time Analytics on Big Data,” “Moneyball for New York City,” “Analyzing Millions of GitHub Commits: What Makes Developers Happy, Angry, and Everything Inbetween?” and “Finance vs Machine Learning”

The space: New York Hilton

The cost: $595-$2045

The dates: Oct. 23-25

3) Big Data Innovation

The pitch:

The Big Data Innovation Summit is the largest gathering of Fortune 500 business executives leading Big Data initiatives.

The speakers:

  • Kurt Smith, Data Scientist at Twitter
  • Mohammad Sabah, Data Scientist at Facebook
  • Ashok Srivastava, Principal Scientist, NASA
  • Steve Hirsh, Chief Data Officer, NYSE
  • Arun Jacob, Director of Data Solutions, Walt Disney

The sessions: “Experimentation at eBay,” “Better Health through Data Science,” “Data-Infused Product Design and Insights at LinkedIn”

The space: Hyatt Regency Boston One

The extracurriculars: Networking drinks in the Exhibitions area, 6pm, 9/13

The cost: $595-$2045

The dates: Sept. 13-14

4) Government Big Data

The pitch:

This outstanding conference brings together the key government and industry experts who are shaping the direction of big data research and development across the Federal Government. They will provide you with an in-depth understanding of Federal agency strategy and plans, the status and forecast for key big data initiatives, and the latest tools and technologies being developed to exploit the massive amounts of information being collected at the Federal level.

The speakers:

  • Dr. Sasi Pillay, CTO for IT, NASA
  • Dr. Christopher White, Program Manager, Information Innovation, DARPA
  • Tasso Argyros, Co-President, Aster Data
  • Susie Adams, CTO, Microsoft Federal
  • Eric Braverman, Partner, McKinsey and Company

The sessions: “Perspectives from the Office of the Secretary of Defense,” “National Aeronautics and Space Agency Perspectives and Initiatives,” “2.0: Surveillance Solution in the Cloud,” “We Didn’t Try to Grow a Bigger Ox: Why USASearch Uses Hadoop”

The space: Holiday Inn Rosslyn at Key Bridge, Arlington, VA

The cost: $1290

The dates: Sept. 18-19

5) Visualized

The pitch:

VISUALIZED explores the evolution of communication at the intersection of big data, storytelling and design. Gain insight into designing data-driven narratives that connect with audiences and visualize the human experience.

The speakers:

  • Sha Hwang, Design Technologist, Trulia
  • Katy Harris, Information Designer, Fathom
  • Hilary Mason, Chief Scientist, Bitly
  • Simon Rogers, Data Journalist, Editor at Large, The Guardian UK
  • Scott Belsky, CEO, Behance
  • Shan Carter, Interactive Graphics Editor, New York Times

The space: Times Center Manhattan

The cost: $799, or $699 if you donate a high-quality used book

The dates: Nov. 8-9

6) Text Analytics World

The pitch:

Text Analytics World is the full-spectrum conference that covers all aspects of text analytics. To solidify the business value you gain from text analytics, TAW delivers the latest methods/techniques, demonstrating their deployment across a wide range of industries large and small.

The speakers:

  • Sarah Ann Berndt, Taxonomist, Johnson Space Center
  • Anna Divoli, Senior Software Researcher, Pingar
  • Heather Edwards, Taxonomy Developer, AP
  • Sue Feldman, Research Vice President, Search and Discovery Technologies, IDC
  • Gregory Piatetsky-Shapiro, Editor, KDNuggets

The sessions: “Predictive Coding in E-Discovery,” “Crossing the Language Chasm: Extracting Information from Foreign Language Text,” “Unified Access to Enterprise Information,” “Big Data and Big Analytics Trends: The Promise and the Hype,” “Harnessing the power of text analytics to drive human capital”

The space: Seaport World Trade Center

The cost: $990-$1790

The dates: Oct. 3-5

New York's Big Datascape, Part 2: Lua Technologies, 10Sheet, Hyperspace, Visible Market, Tout'd

[Image via Turf Geography Club]

This ongoing series examines some of the key, exciting players in New York’s emerging data-based startup scene. The companies I’m highlighting differ in scope, design, and target industries, but converge around similar needs to store, process, and analyze large amounts of web data. You can read about the first five companies here.

1) 10Sheet

  • Product: 10Sheet is a cloud-based bookkeeping application. Users snap photos of their receipts, then email or FedEx them to 10Sheet to be scanned, recorded, and turned into complete financial statements. 10Sheet can also automatically pull in your bank and credit card transactions.
  • Founders: Ian Crosby, CEO; Jordan Menashy, CMO; Paul Rodionov, CTO; Adam Saint, Creative Director
  • Target industry: Small businesses and individuals in need of bookkeepers
  • Location: Flatiron
  • Funders: TechStars

2) Visible Market

  • Product: StockTouch displays the real-time performance of 1400 stocks in 9 industries in a touch-enabled heatmap. Tapping a stock lets you dive deeper into its background and historical performance.
  • Founders: Jennifer Johnson, CEO; Steve de Brun
  • Target industry: Currently independent investors, but VM plans to roll out a software kit for institutions in 2013.
  • Funders: FinTech Innovation Lab, Marek Fludzinsky

3) Lua Technologies

  • Product: Lua is a communication, collaboration, and scheduling platform for mobile teams. Think of it as GroupMe + Google Drive + Tungle.me.
  • Founders: Michael DeFranco, CEO; Jason Krigsfeld; Eli Bronner
  • Target industries: Lua’s made a splash in the entertainment industry, but the platform would be useful to any companies with multiple teams of mobile workers
  • Location: Flatiron
  • Funders: IA Ventures, Aaron Stone, Strauss Zelnick, John Maloney

4) Tout’d

  • Product: Tout’d is a Facebook-based recommendation app. Users can ask for a recommendation for, say, a barbeque shack in Red Hook, and Tout’d will pass the question on to Tout’d-using friends and friends of friends.
  • Founders: Arron Kallenberg, Rob Morelli, Saro Cutri
  • Target industry: Consumer
  • Location: Flatiron
  • Funders: Tout’d was just acquired by Localcents. Previous funders include Warner Hill Angels

5) Hyperspace

What cool big data startups are you following (or founding, or funding)? Let me know in the comments!

New York’s Big Datascape, Part 1: Timehop, Parse.ly, Bitly, 10Gen, 2tor

[Image via AllThingsD]

I started writing about innovators in Boston’s big data scene in the earliest days of Riparian. Researching what other companies were building, analyzing, and selling gave me a narrative for what might otherwise still be a murky set of concepts. It also introduced me to some fascinating ideas; Bluefin Labs’ TV Genome and Recorded Future’s event forecasting come to mind. And so, nearly two months into my New York sojourn, I’m expanding this series in the hopes of making the acquaintance of these companies’ NYC equivalents.

Some people like to say that New York and Boston are rivals. When it comes to sports, I think this is valid; when it comes to technology, I think it’s silly. By and large, the technology each city produces serves different sectors: life sciences, healthcare, and higher ed in Boston; fashion, media, finance, and consumer web in New York. Of course, there are exceptions (there are always exceptions), but exceptions are testaments to heterogeneity, not (usually) harbingers of power shifts. Four of the following companies serve one or more of the city’s main sectors; the fifth serves higher ed, a sector that, especially these days, needs to be better served everywhere.

Timehop

Parse.ly

Bitly

10Gen

  • Product: 10Gen makes MongoDB, a distributed database that stores data in JSON/BSON documents (think MySQL with a document-based data model).
  • Founders: Dwight Merriman, CEO (@dmerr); Eliot Horowitz, CTO (@eliothorowitz)
  • Technology Used: MapReduce, Aggregation Framework, atomic operations
  • Target industries: Consumer web, Digital Media, Mobile
  • Location: SoHo (also Palo Alto, CA)
  • Funders: Flybridge Capital Partners, Sequoia Capital, Union Square Ventures 
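
For readers wondering what a "document-based data model" means in practice, here is a rough sketch in plain Python. It does not use MongoDB or any of its drivers; the post, tags, and comments are made-up sample data, and rebuild_from_tables is a hypothetical helper that exists only for this illustration. The idea is simply that one nested document can hold what a relational layout would spread across several tables.

```python
import json

# Made-up sample data: one blog post stored as a single nested document.
# Tags and comments live inside the post instead of in separate tables.
post = {
    "_id": 1,
    "title": "New York's Big Datascape",
    "tags": ["bigdata", "nyc"],
    "comments": [
        {"author": "ana", "text": "Great list"},
        {"author": "raj", "text": "Add more startups"},
    ],
}

# The same information in a relational layout: three tables keyed by post id.
posts_table = [(1, "New York's Big Datascape")]
tags_table = [(1, "bigdata"), (1, "nyc")]
comments_table = [(1, "ana", "Great list"), (1, "raj", "Add more startups")]


def rebuild_from_tables(post_id):
    """Hypothetical helper: join the three tables back into one document."""
    title = next(t for pid, t in posts_table if pid == post_id)
    return {
        "_id": post_id,
        "title": title,
        "tags": [tag for pid, tag in tags_table if pid == post_id],
        "comments": [
            {"author": author, "text": text}
            for pid, author, text in comments_table
            if pid == post_id
        ],
    }


# A document of this shape survives a trip through JSON text unchanged.
serialized = json.dumps(post)
round_tripped = json.loads(serialized)
```

Reading the post back is a single lookup in the document version, while the table version needs the equivalent of two joins. That trade-off, not any particular API, is what the MySQL comparison above is getting at.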

2tor

 

Pleased to Meetcha: Big Data Meetups in New York and Boston

I moved to New York this past Monday. It’s been a year and a half since I last lived here, and changes which were fledgling when I left (Silicon Alley, Williamsburg waterfront, smoothie stands) are full-fledged now. Part of my reason for moving was selfish—I missed the city. Part of my reason was practical—New York’s big data ecosystem is leaner, in some areas faster, more focused on the consumer: in other words, it’s a good complement to the rich, database- and life-science/healthcare-focused ecosystem in Boston.

While I’ve shifted my residential allegiance to New York, professionally I’ll be keeping a foot in each city, exercising my role as Riparian’s mouth by meeting as many data scientists, analysts, and high-volume email users as I can, partly one-on-one and partly through meetups like the ones listed below. (At last, I get to the point of this post!) The following are just a small selection of the data-related meetups in New York and Boston, but I think they ably represent some of the buzziest aspects of, and players in, this very buzzy topic.

New York

1. Analytics and Data in Financial Services

  • Description: Knowledge of and fluency in the language of analytics is becoming increasingly important in business, especially in the financial services industry.
  • Aimed at: People doing big data analysis, especially those in the FS arena
  • Hosted by: Jaime Fitzgerald (in / t)
  • Members: 346
  • Past meetups: “From Tufte to the Magic Kingdom: Telling the Story Behind the Data,” “Member Demos of Predictive Models”

2. Predictive Analytics, Applied Machine Learning, Big Data

  • Description: Discuss diverse topics in predictive analytics and applied machine learning.
  • Aimed at: Analysts, computer scientists, engineers, executives, entrepreneurs and students with a deep interest in these fields & related technologies.
  • Hosted by: Alex Lin (in / t)
  • Members: 2155
  • Past/Upcoming meetups: “Designing Machine Learning Algorithms for Hadoop,” “The art of predictive analytics: More data, same models”

3. Open Analytics NYC

  • Description: A group devoted to the use and development of open source, big data, agile intelligence solutions, for the NYC Metro area.
  • Aimed at: People interested in solving real business problems utilizing open source, big data analytical solutions.
  • Hosted by: Scott Raspa (in / t)
  • Members: 152
  • Upcoming meetup: “How to Gain Intelligence from Open Analytic Solutions using MongoDB & Hadoop”

4. Digital Semiotics

  • Description: Talk about the relationship between semiotics and digital technology, covering topics around: traditional semiotics, computational semiotics, computational linguistics, user experience, interface design, interaction design, information architecture, robotics/intelligent machines, human-computer interaction
  • Aimed at: Academics, advertising professionals, independent researchers, computer scientists, digital anthropologists, linguists, designers
  • Hosted by: Thomas Wendt (b / t)
  • Members: 44
  • Upcoming meetup: n/a

5. NoSQL NYC

  • Description: Discuss any alternative databases, from large distributed key-value hash tables to document stores.
  • Aimed at: NoSQL enthusiasts
  • Hosted by: Edward Capriolo (b / in), Mark Pollack (in)
  • Members: 930
  • Upcoming/past meetups: “The Graph in Your Data - A Neo4j Intro & An Intro to GoldenOrb”

Boston

1. Boston Predictive Analytics

  • Description: Informative lectures, hands-on tutorials, networking events, etc. Three main focal points: business applications, advanced mathematics, and computer science, with topics including recommender systems, machine learning, Google Analytics, data visualization, and social media / text analytics.
  • Aimed at: ML/NLP/Data Miners/Modelers
  • Hosted by: John Verostek (b / t)
  • Members: 1139
  • Past/upcoming meetups: “Content Recommendations Using Bayesian Classification via Apache Mahout”

2. Open Analytics Boston

  • Description: See “Open Analytics NYC” above
  • Aimed at: ditto
  • Hosted by: Scott Raspa
  • Members: 95
  • Upcoming meetups: n/a

3. The Data Scientist

  • Description: This group will concentrate on understanding the tools and skill-sets needed to become an effective Data Scientist. We will explore all topics related to the data lifecycle including acquiring new data sets, parsing new data sets, filtering and organizing data, mining data patterns, advanced algorithms, visually representing data, telling stories with data and softer skills such as negotiations and selling your ideas based upon data.
  • Aimed at: Data scientists, and those who aspire to that title.
  • Hosted by: John Baker (in / t), Carrie Stalder (in / t)
  • Members: 158
  • Upcoming meetups: n/a

4. Emerging Business Technology

  • Description: Provides engineers, practitioners and managers the context needed to evaluate and adopt rapidly evolving business technologies. Leave with an understanding of what the technology is, why it’s used, when to use it, and next steps to take. We’ll review use cases, processes, tools, and practices in a mini-conference format through short presentations, hands-on tutorials, Q&A and code walkthroughs. Topics may include Mobile app development, HTML5, responsive design, high-concurrency applications, interaction design, modern languages and frameworks, NoSQL databases, and mobile / tablet application design.
  • Aimed at: Engineers and business users.
  • Hosted by: Dan Adams (b / t)
  • Members: 172
  • Upcoming meetups: “NoSQL in the Real World”

5. Boston Hadoop User Group

  • Description: For developers who are using Hadoop, or would like to learn more about it. Also includes technologies built with/on top of Hadoop: Hive, Pig, HBase, etc.
  • Aimed at: See above. Essentially, anyone interested in big data technologies (not only Hadoop-specific ones, though those are the foci).
  • Hosted by: Reed Shea (in / t) and, last month, myself (in / t)
  • Members: 761
  • Upcoming/past meetups: “Training Session with Hortonworks,” “More Data vs. Better Data vs. Better Algorithms”