Honeycomb Under the Hood

Honeycomb combines the raw accuracy of log aggregators, the speed of time series metrics, and the flexibility of APM (application performance metrics) to provide the world's first truly next-generation analytics service. Originally modeled on Facebook's Scuba data platform, it has grown into an intuitive, delightful tool for exploring every part of your stack, from debugging slow queries to database internals, from application code to the network stack, to deep security dives.

If you can represent it in a structured data event, Honeycomb lets you slice and dice it in real time. Honeycomb supports arbitrarily wide events and lets you query on as many attributes as you want. Come learn how Honeycomb is designed, from the custom-built column store up to the fan-out query model. Our mission is to make Honeycomb the tool that helps everyone be as good as the best debugger on your team, but this session will be a deep dive under the hood into how it really works: tradeoffs, deals with the devil, and all.

Christine Yen

April 25, 2017

Transcript

  1. Speed of Time Series + Raw Power of Rich Events = Interactive, Iterative Debugging for Systems
  2. ORIGINS: FACEBOOK’S "SCUBA"
    "A fast, scalable, distributed, in-memory database built at Facebook"
    "used extensively for interactive, ad hoc, analysis queries that run in under a second over live data"
    + Parse
  3. ORIGINS: FACEBOOK’S "SCUBA"
    ▸ Flexibility to dive into dependencies and natural partitions in the data
    ▸ Fast enough to support natural human query patterns
    ▸ Started out to debug MySQL performance regressions
    ▸ Now used anywhere repeated ad-hoc analysis is needed
  4. DESIGN GOALS
    ▸ Read-time aggregation
    ▸ Ability to reconstruct raw rows
    ▸ Flexible schemas and sparse rows
    ▸ Speedy analytical reads ("best effort availability")
    ▸ Near-realtime behavior
    ▸ Real-world hardware constraints :)
  11. INGESTION GOALS
    ▸ Simple, straightforward, and fast
    ▸ Don’t spend innovation tokens here
    ▸ SSDs are cheap enough to support needed speeds
    ▸ Rely on filesystem to help with things like pruning
  12. INGESTION FLOW
    Incoming event: 42, 1493025003, { 1345: "POST", 1373: 27.523 … }
    Dataset ID: 42
    Timestamp: 1493025003
    Column ID: 1345, Value: "POST"
    Column ID: 1373, Value: 27.523
  13. INGESTION FLOW
    Incoming event: 42, 1493025003, { 1345: "POST", 1373: 27.523 … }
    Dataset ID: 42
    Timestamp: 1493025003
    Column ID: 1345, Value: "POST"
    Column ID: 1373, Value: 27.523
    Column files (row index:value):
    Col 0: 0:1493025000 1:1493025002 2:1493025003 3:1493025004
    Col 1345: 0:"GET" 3:"POST"
    Col 1373: 0:4.208 1:0.00 3:27.523
  14. INGESTION FLOW
    Incoming event: 42, 1493025003, { 1345: "POST", 1373: 27.523 … }
    Dataset ID: 42
    Timestamp: 1493025003
    Column ID: 1345, Value: "POST"
    Column ID: 1373, Value: 27.523
    Column entries (row index:value): 0:1493025000 1:1493025002 2:1493025003 3:1493025004 0:"GET" 3:"POST" 0:4.208 1:0.00 3:27.523
    Later column entries (row indices restart at 0): 0:1493025006 1:1493025007 0:"POST" 0:22.199
    Metadata labels: Min/Max Timestamp, Latest Index, Latest Kafka Offset; Min/Max Timestamp, Latest Index
  15. INGESTION YOU MAY NOTICE
    ▸ No indices to maintain on the write path
    ▸ No compaction or compression
    ▸ Open road to optimizations
  16. READS GOALS
    ▸ Only pull minimum data necessary to answer the question
    ▸ Approximate whenever possible
    ▸ Even if data ages out, results (previously-run queries) shouldn’t
  17. READS FLOW
    Query: AVG(total_ms) where method = "POST" over the last 2 hours
    Column data (row index:value), one column per line:
    0:1493025000 1:1493025002 2:1493025003 3:1493025004
    0:"GET" 2:"POST" 3:"POST"
    0:4.208 2:0.00
    0:app25 1:app7 2:app16 3:app25
    0:ios 1:android 2:js 3:js
    0:0.497 1:1.253 2:0.027 3:2.119
  18. READS FLOW
    Query: AVG(total_ms) where method = "POST"
    Column data (row index:value), one column per line:
    0:1493025000 1:1493025002 2:1493025003 3:1493025004
    0:"GET" 2:"POST" 3:"POST"
    0:4.208 2:0.00
    0:app25 1:app7 2:app16 3:app25
    0:ios 1:android 2:js 3:js
    0:0.497 1:1.253 2:0.027 3:2.119
  20. READS FLOW
    Query: AVG(total_ms) where method = "POST"
    Column data (row index:value), one column per line:
    0:1493025000 1:1493025002 2:1493025003 3:1493025004
    0:"GET" 2:"POST" 3:"POST"
    0:4.208 2:0.00
    0:app25 1:app7 2:app16 3:app25
    0:ios 1:android 2:js 3:js
    0:0.497 1:1.253 2:0.027 3:2.119
    Result: = 1.073
  21. TRADEOFFS
    ▸ Always prefer availability and speed (from the perspective of the user)
    ▸ Write-heavy workload means we optimize for write performance (append-only ingest)
    ▸ Optimizing reads is easier than optimizing writes
    ▸ Users can input particular patterns to degrade their own query performance
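
The ingestion and read flows in slides 12 through 20 can be approximated with a small in-memory sketch. This is an illustration under stated assumptions, not Honeycomb's actual implementation: the `Segment` class, the `avg_where` function, and the column names `app` and `platform` are hypothetical, Python lists stand in for the on-disk column files, and the mapping of the slide's values to `total_ms` is inferred from the 1.073 result on slide 20. Each event is appended to one list per column as (row index, value) pairs, with no indices maintained on the write path. The read pulls only the three columns the query needs.

```python
class Segment:
    """Hypothetical in-memory stand-in for one set of append-only column files."""

    def __init__(self):
        self.columns = {}    # column name -> list of (row_index, value), append-only
        self.next_row = 0    # next row index to hand out
        self.min_ts = None   # min/max timestamp metadata, as hinted on slide 14
        self.max_ts = None

    def append(self, timestamp, fields):
        """Append one event; sparse rows simply skip the columns they lack."""
        row = self.next_row
        self.next_row += 1
        self.columns.setdefault("timestamp", []).append((row, timestamp))
        for name, value in fields.items():
            self.columns.setdefault(name, []).append((row, value))
        self.min_ts = timestamp if self.min_ts is None else min(self.min_ts, timestamp)
        self.max_ts = timestamp if self.max_ts is None else max(self.max_ts, timestamp)
        return row


def avg_where(segment, agg_col, filter_col, filter_value, start_ts, end_ts):
    """Read-time aggregation: AVG(agg_col) where filter_col = filter_value."""
    # Skip the whole segment when its timestamp range misses the query window.
    if segment.min_ts is None or segment.max_ts < start_ts or segment.min_ts > end_ts:
        return None
    # Scan only the columns the query touches: timestamp, filter, aggregate.
    in_window = {row for row, ts in segment.columns.get("timestamp", [])
                 if start_ts <= ts <= end_ts}
    matching = {row for row, value in segment.columns.get(filter_col, [])
                if value == filter_value and row in in_window}
    values = [value for row, value in segment.columns.get(agg_col, [])
              if row in matching]
    return sum(values) / len(values) if values else None


# The four rows shown on the READS FLOW slides (row 1 has no "method" value).
segment = Segment()
segment.append(1493025000, {"method": "GET", "app": "app25", "platform": "ios", "total_ms": 0.497})
segment.append(1493025002, {"app": "app7", "platform": "android", "total_ms": 1.253})
segment.append(1493025003, {"method": "POST", "app": "app16", "platform": "js", "total_ms": 0.027})
segment.append(1493025004, {"method": "POST", "app": "app25", "platform": "js", "total_ms": 2.119})

now = 1493025004
result = avg_where(segment, "total_ms", "method", "POST", now - 2 * 60 * 60, now)
print(round(result, 3))  # 1.073
```

Only rows 2 and 3 match the filter, so the average is (0.027 + 2.119) / 2, which agrees with the result on slide 20. The `app` and `platform` columns are never read by this query, which is the point of the "only pull minimum data necessary" goal.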