Designing monolithic infrastructure up front is a common mistake in large projects. More often than not, such infrastructure is too generic, makes false assumptions, or is simply delivered too late for feature developers to use, becoming a "white elephant".
This presentation is a case study of my team's work to deliver Live Merging of Snapshots in oVirt, from the initial steps in oVirt 3.1.0 to the full delivery in 3.5.0. It shows how good design can be feature-driven, building infrastructure step by step while gaining small wins along the way.
Step by Step - Reusing old features to build new ones
1. DevConfCZ, Feb 2015
Allon Mureinik
amureini@redhat.com / @mureinik
Supervisor, RHEV Storage
Red Hat
Old school software development
Requirements
Design
Implementation
Verification
Maintenance
What Is oVirt?
● Large scale, centralized management for server and desktop virtualization
● Based on leading performance, scalability and security infrastructure technologies
● Provide an open source alternative to vCenter/vSphere
● Focus on KVM for best integration/performance
● Focus on ease of use/deployment
oVirt: Not a Single Project
● oVirt-Engine
● VDSM
● oVirt-Node
● oVirt-Engine-SDK
● oVirt-Engine-CLI
● oVirt-Guest-Agent
● oVirt-Image-Uploader
● oVirt-ISO-Uploader
● oVirt-Log-Collector
● oVirt-DWH
● oVirt-Reports
● Incubation Projects
– MOM
– moVirt
● Test Projects
● … your contributions are welcome!
● See http://ovirt.org for details
Live Snapshot
● Capture disks and memory at a point in time
● Implemented using qcow2 volume chains
● Usages
– Save the state before a major change
● Can be previewed or reverted
– VM live backup
– Live Storage Migration
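To make "volume chains" concrete, here is a minimal sketch of a copy-on-write chain. It is a toy model written for illustration, not qcow2 or oVirt code: the `Volume` class and the `take_snapshot` and `chain` helpers are hypothetical names. It assumes that taking a snapshot freezes the current top layer and puts a new, empty layer above it, so later writes never touch the snapshot.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Volume:
    """One layer in a copy-on-write chain (toy model, not the qcow2 format)."""
    name: str
    parent: Optional["Volume"] = None
    # Only the blocks written while this layer was the active one.
    blocks: Dict[int, str] = field(default_factory=dict)

    def read(self, block: int) -> Optional[str]:
        # Walk down the backing chain until some layer holds the block.
        layer: Optional[Volume] = self
        while layer is not None:
            if block in layer.blocks:
                return layer.blocks[block]
            layer = layer.parent
        return None


def take_snapshot(active: Volume, new_name: str) -> Volume:
    """Freeze the current active layer and return a new empty layer above it."""
    return Volume(name=new_name, parent=active)


def chain(active: Volume) -> List[str]:
    """List layer names from the base up to the active layer."""
    names = []
    layer: Optional[Volume] = active
    while layer is not None:
        names.append(layer.name)
        layer = layer.parent
    return names[::-1]


base = Volume("base", blocks={0: "boot", 1: "data-v1"})
active = take_snapshot(base, "active")
active.blocks[1] = "data-v2"  # new writes land in the top layer only
```

In this model, reverting to the snapshot amounts to discarding the top layer, while deleting the snapshot requires folding one layer into its neighbour, which is what a merge does.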
The next logical step...
● Bug 647386 - Support live deletion of a snapshot / live-merge
– Reported against RHEVM 2.3.0 (Oct. 2010)
– 27 customer tickets
– http://www.ovirt.org/Features/Live_Merge
● So what's the big deal?
Problem 2 – long running tasks...
● Up to 3.5.0, oVirt has two kinds of verbs to communicate with VDSM:
– Synchronous verbs
● Finish in under 3 minutes
● Give result immediately
– Asynchronous verbs
● May take a long time to complete
● Return a task to be monitored
● Only run on SPM
– Engine commands have up to 3 stages
● executeAction() - Synchronous database + VDSM
● Poll the task until it completes (or fails)
● endSuccessfully() / endWithFailure()
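The three stages above can be sketched roughly as follows. This is an illustrative model only, not the oVirt engine code (which is written in Java): `FakeVdsmTask`, `Command`, `run` and the `max_polls` limit are all hypothetical, and the method names merely mirror the ones on the slide.

```python
from enum import Enum


class TaskStatus(Enum):
    RUNNING = "running"
    FINISHED = "finished"
    FAILED = "failed"


class FakeVdsmTask:
    """Stand-in for the task handle an asynchronous verb returns (hypothetical)."""

    def __init__(self, polls_until_done: int, succeed: bool = True):
        self._remaining = polls_until_done
        self._succeed = succeed

    def status(self) -> TaskStatus:
        if self._remaining > 0:
            self._remaining -= 1
            return TaskStatus.RUNNING
        return TaskStatus.FINISHED if self._succeed else TaskStatus.FAILED


class Command:
    """Hypothetical command with the three stages from the slide."""

    def __init__(self, task: FakeVdsmTask):
        self._task = task
        self.log = []

    def execute_action(self) -> FakeVdsmTask:
        # Stage 1: synchronous work (database updates, starting the verb).
        self.log.append("execute")
        return self._task

    def end_successfully(self) -> None:
        self.log.append("end_successfully")

    def end_with_failure(self) -> None:
        self.log.append("end_with_failure")


def run(command: Command, max_polls: int = 100) -> bool:
    task = command.execute_action()
    # Stage 2: poll until the task completes or fails.
    # A real poller would wait between polls; omitted to keep the sketch short.
    for _ in range(max_polls):
        status = task.status()
        if status is TaskStatus.FINISHED:
            command.end_successfully()  # Stage 3, success path
            return True
        if status is TaskStatus.FAILED:
            command.end_with_failure()  # Stage 3, failure path
            return False
    command.end_with_failure()  # assumption: a poll timeout counts as failure
    return False
```

A synchronous verb would skip stage 2 entirely, which is why the slide says commands have "up to" three stages.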
Solution 2 - SEAT
● A mechanism was added for Serial Execution of Asynchronous Tasks (SEAT)
– http://wiki.ovirt.org/Features/Serial_Execution_of_Asynchronous_Tasks
– Allows creating chains of actions:
● execute
● poll for a task
● move to the next execution...
● ... or rollback everything
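The chain idea can be sketched in a few lines. This is not the SEAT implementation: `Step` and `run_chain` are hypothetical names, each step's `execute` is assumed to start its task and wait for the result, and the sketch assumes that only steps that completed are rolled back, in reverse order.

```python
from typing import Callable, List, NamedTuple


class Step(NamedTuple):
    name: str
    execute: Callable[[], bool]    # start the task and wait; True on success
    rollback: Callable[[], None]   # undo a step that already completed


def run_chain(steps: List[Step]) -> List[str]:
    """Run steps serially; on failure undo the completed ones in reverse order."""
    history: List[str] = []
    done: List[Step] = []
    for step in steps:
        history.append(f"execute:{step.name}")
        if step.execute():
            done.append(step)
            continue
        # A step failed: skip the remaining steps and roll back what finished.
        for finished in reversed(done):
            history.append(f"rollback:{finished.name}")
            finished.rollback()
        break
    return history
```

Returning the history makes the order of executions and rollbacks easy to inspect, which is the property that matters most when several long-running tasks are chained.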
Solution 2 – Why would I even...
● Live Storage Migration
– http://www.ovirt.org/Features/Design/StorageLiveMigration
● Utilizes SEAT for a series of tasks:
– [Live Snapshot – not mandatory]
– Clone image structure
– Start syncing active image
– Sync backing chain
– Stop sync
– Remove (and wipe) source
Problem 3 – Still only SPM tasks
● Up to 3.5.0, only SPM can run asynchronous tasks
– This is due to the requirement to persist task info on the master domain
Solution 3 – HSM “Tasks”
● Separate the coordination code from the polling code
– http://www.ovirt.org/Features/Design/CommandCoordinator
– Report the progress of the block job in the polled VM stats
● Now the HSM that runs the VM can run the merge verb
– The basis for rewriting VM migration
– The basis for removing the SPM completely
● Come hear all about it at DevConfCZ 2016!
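A rough sketch of what separating coordination from polling might look like: the coordinator starts the merge and then only reads job progress out of VM stats that are being collected anyway. Everything here is hypothetical, including the shape of the stats dictionary (`{job_id: {"cur": n, "end": m}}`) and the names `job_progress` and `coordinate_merge`. It is not the VDSM or engine API.

```python
from typing import Callable, Dict, Iterable, List, Optional, Union

# Hypothetical shape of one stats sample: {job_id: {"cur": n, "end": m}}
VmStats = Dict[str, Dict[str, int]]


def job_progress(stats: VmStats, job_id: str) -> Optional[float]:
    """Return progress in [0, 1], or None if the job is no longer reported."""
    job = stats.get(job_id)
    if job is None:
        return None
    if job["end"] == 0:
        return 0.0
    return job["cur"] / job["end"]


def coordinate_merge(start_merge: Callable[[], str],
                     stats_samples: Iterable[VmStats]) -> List[Union[float, str]]:
    """Start the merge, then follow it through stats that are polled anyway."""
    job_id = start_merge()
    seen: List[Union[float, str]] = []
    for stats in stats_samples:
        progress = job_progress(stats, job_id)
        if progress is None:
            # Assumption: the job vanished from the stats. A real system
            # would still have to verify how the merge actually ended.
            seen.append("gone")
            break
        seen.append(progress)
    return seen
```

Because no dedicated task record is persisted in this model, nothing ties the job to the SPM; any host that runs the VM and reports its stats could serve as the source of progress.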