Performance Tuning for MySQL
Background
Just a tracker issue for doing some performance tuning for MySQL. Looking to capture notes on this card along the way.
As we move all repositories to MySQL, it is imperative to understand the operational characteristics of MySQL, especially as it pertains to VinylDNS.
The critical aspects from a size and performance standpoint are as follows.
Bulk inserting for record set and record changes
When a zone is loaded into VinylDNS for the first time, we run an AXFR full zone transfer and store all record sets in the VinylDNS system. We also store a corresponding record change for each record set at the same time.
For most zones, performance is a non-issue. However, when zones have 100,000s or 1MMs of record sets in them, this operation can be understandably expensive.
Archiving of record changes
With DynamoDB, it is relatively cheap to store all data in the record change table long-term. However, with MySQL, disk space is limited compared to DynamoDB. We need a plan for archiving record changes, potentially configuration driven, with background jobs / processes / stored procedures and triggers that move data past a certain age into some kind of long-term backup.
This may not be necessary for all VinylDNS users, as it depends on data size.
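A hedged sketch of what a configuration-driven archive job could execute; the record_change and record_change_archive table names, the created column, and the one-year cutoff are assumptions for illustration, not existing VinylDNS schema:

-- Copy changes older than the cutoff into an archive table, then remove them
INSERT INTO record_change_archive
SELECT * FROM record_change
WHERE created < DATE_SUB(NOW(), INTERVAL 1 YEAR);

DELETE FROM record_change
WHERE created < DATE_SUB(NOW(), INTERVAL 1 YEAR);
-- In practice the DELETE would be run repeatedly with a LIMIT so that
-- transactions stay short and locks are not held on a busy table.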
Query performance for record set and record changes
VinylDNS installations may have 1MMs of record sets and record changes. For example, we anticipate Comcast to grow to 100MMs of record sets and 100GB of record changes. At this size, it is imperative to have an archiving strategy for old record changes.
Indexing and query performance will be critical to the responsiveness of the application. We have to fine tune indexes and understand query execution plans for record and record change data.
Notes
Bulk Insert in MySQL
Bulk insert in MySQL is achieved using an extended insert:
INSERT INTO table(a, b, c) VALUES (?, ?, ?), (?, ?, ?), (?, ?, ?)
The SQL shows that you can keep appending additional rows under the VALUES clause seemingly indefinitely.
The number of rows you can insert in a single query depends on a number of factors, most importantly the max_allowed_packet setting.
There are a number of ways to do bulk inserts with MySQL, including file loading. High-speed inserts with MySQL covers the different ways to bulk load data. Key takeaway: extended inserts achieve 201,000 inserts per second at 26 bytes per row on average.
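The file-loading route mentioned above is MySQL's LOAD DATA INFILE. A minimal sketch against a hypothetical table_name with columns a, b, c (the file path is a placeholder):

-- Load comma-separated rows straight from a file on the client
LOAD DATA LOCAL INFILE '/tmp/rows.csv'
INTO TABLE table_name
FIELDS TERMINATED BY ','
(a, b, c);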
Proposed recordset table
CREATE TABLE recordset (
id CHAR(36) NOT NULL,
zone_id CHAR(36) NOT NULL,
name VARCHAR(256) NOT NULL,
fqdn VARCHAR(256) NOT NULL,
type TINYINT NOT NULL,
data BLOB NOT NULL,
PRIMARY KEY (id),
INDEX zone_id_name_index (zone_id, name, type),
INDEX fqdn_index (fqdn, type)
);
Inspecting the record set table, we have the following field sizes:
- id - ascii - CHAR(36) - 1 byte per character = 36 bytes
- name - ascii - VARCHAR(256) - 1 byte per character = 36 bytes
- fqdn - ascii - VARCHAR(256) - 1 byte per character = 36 bytes
- type - TINYINT - 1 byte per entry = 1 byte
- data - BLOB - the protobuf byte array of the record set. Currently, this is roughly 150 bytes
IMPORTANT: This makes each record in the record set table ~260 bytes
For example, sending 1000 inserts in a batch would be approximately 260,000 bytes of data.
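As a concrete sketch against the proposed recordset table (the ids, names, and the hex literals standing in for the protobuf blobs are all placeholders):

INSERT INTO recordset (id, zone_id, name, fqdn, type, data)
VALUES
('6b2f6e7e-0000-0000-0000-000000000001', '9f2b8a6e-0000-0000-0000-000000000000', 'www', 'www.example.com.', 1, x'0A03777777'),
('6b2f6e7e-0000-0000-0000-000000000002', '9f2b8a6e-0000-0000-0000-000000000000', 'mail', 'mail.example.com.', 1, x'0A046D61696C');
-- repeat the tuple once per row in the batch, keeping the total statement
-- size under max_allowed_packet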
max_allowed_packet
The maximum max_allowed_packet in MySQL is presently 1GB. That size is untenable to work with.
If we were to work with a 4MB packet size, we should be able to batch 10,000 records per query with lots of headroom (10,000 records x 260 bytes = 2.6MB). Headroom is necessary in the event we need to add more fields, or if we max out all of the field sizes in a row.
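For reference, the limit can be inspected and raised at runtime (the change only applies to connections opened afterward; a permanent value belongs in my.cnf):

SHOW VARIABLES LIKE 'max_allowed_packet';
SET GLOBAL max_allowed_packet = 4194304;  -- 4MB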
CONCLUSION
The performance of MySQL at our sizes (100MMs of records) and query access patterns seems reasonable. It also gives us the added benefit of easier querying via SQL.
The downside comes with locking / transactions. VinylDNS is built with an eventually consistent / last-write-wins model for persistence. We are happy to overwrite records in the database. Unfortunately, there is no way to accomplish this with InnoDB in MySQL without the additional locking overhead it imposes.
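For completeness, the overwrite semantics can be expressed with MySQL's INSERT ... ON DUPLICATE KEY UPDATE; a sketch with placeholder values is below. Note that this still takes InnoDB row locks on the conflicting key, so it captures last-write-wins but does not remove the locking overhead described above:

INSERT INTO recordset (id, zone_id, name, fqdn, type, data)
VALUES ('6b2f6e7e-0000-0000-0000-000000000001', '9f2b8a6e-0000-0000-0000-000000000000', 'www', 'www.example.com.', 1, x'0A03777777')
ON DUPLICATE KEY UPDATE
zone_id = VALUES(zone_id),
name = VALUES(name),
fqdn = VALUES(fqdn),
type = VALUES(type),
data = VALUES(data);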
Insert Testing
At very large data sizes (250MM records, 250MM record set changes, 2MM zones) we were able to sustainably achieve insert throughput of 2,200 record sets + changes per second. Subsequent testing revealed that this could be as high as 3,000 RPS. Given that large zones enter VinylDNS sporadically, 2,200 RPS is more than sufficient to meet our needs.
Query Testing
At large data sizes (100MM record sets), all record set queries consistently returned in the very low milliseconds across all query types. Assuming that indexes are set up properly, queries should continue to be responsive.
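A quick way to confirm the indexes are doing their job is to EXPLAIN the two main query shapes against the proposed table (the literal ids and names below are placeholders):

-- Lookup by zone, record name, and type; should use zone_id_name_index
EXPLAIN SELECT id, data FROM recordset
WHERE zone_id = '9f2b8a6e-0000-0000-0000-000000000000' AND name = 'www' AND type = 1;

-- Lookup by fully qualified domain name and type; should use fqdn_index
EXPLAIN SELECT id, data FROM recordset
WHERE fqdn = 'www.example.com.' AND type = 1;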
Changes needed
- DynamoDB is seemingly “unbounded” in data sizes and memory, whereas MySQL is not. To fully migrate to MySQL, we must have an archiving strategy for old zone and record changes.
- Support separate writer and reader connections to MySQL. Aurora specifically supports this option automatically, and it would allow us to run lock-free queries on the reader endpoint while contention happens on the writer endpoint.
Changes recommended
- Zone Loading currently just follows a zone sync process to load zones into the database. Converting DNS zone data via AXFR to record sets takes up a lot of memory. We should instead stream the changes into the database via FS2, avoiding memory pressure.
- Zone syncing is extremely expensive from a memory perspective. We load all record sets from VinylDNS, as well as all record sets from the DNS AXFR, and perform a diff/merge process. If we have very large zones approaching 1MM records, we may run out of memory. A different process for zone syncing should be designed that can do the same diff/merge but in a more memory-safe way.
TABLED DISCUSSION
- We should have a separate process for initial loading of zones other than zone syncing. This process can be built using FS2, where we stream / chunk records and record changes in. This will keep the amount of memory used during loading of large zones well managed. Make this an issue!
- Have to figure out how to better do zone syncing for large zones. Presently, we load all of our records out of the database into an in-memory data structure, load all of the DNS records via an AXFR, and do a diff/merge process on the two. We may be able to accomplish a better version with intelligent record lookup from VinylDNS, mitigating the need to do an entire bulk load for the zone sync process (see the sketch below). This should probably be done!
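One possible shape for that intelligent record lookup, sketched under the assumption that the AXFR is processed in chunks: for each chunk, fetch only the matching VinylDNS rows and diff/merge chunk by chunk rather than holding the whole zone in memory (the zone id and names are placeholders):

SELECT id, name, type, data
FROM recordset
WHERE zone_id = '9f2b8a6e-0000-0000-0000-000000000000'
AND name IN ('www', 'mail', 'ftp');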
Local testing
Results: 4 seconds!
Query Results - on server
The query results running in OpenStack had a marked improvement over those running on a local machine over VPN. Will not include all samples, but most query response times are in the very low milliseconds.