This document discusses using cells in OpenStack Nova to scale deployments. Cells create a hierarchy with a top-level API cell and multiple compute cells. Each cell has its own database, message queue, and services. The document outlines planning the conversion, preparing the environment by expanding RabbitMQ and splitting services, configuring the compute and API cells, importing data to the API cell, and restarting services. It notes some caveats like limitations on certain notifications and objects between cells.
3. CELLS INTRODUCTION
Use cells to overcome …
• Large number of nova-computes
• Single message queue instance
• Complicated scheduling
• Multi-site behind one API
4. CELLS INTRODUCTION
Cells defined
• Hierarchy of Nova instances
• Each has database, message queue, scheduler, and compute
• Message routing between cells to perform operations
• Top-level API cell for nova-api and cell scheduling
• Overrides the default compute API class
• Lots of caveats
• This is cells v1 (v2 in Liberty)
7. CELLS INTRODUCTION
More details to get started
• Nova cells configuration reference
• http://docs.openstack.org/juno/config-reference/content/section_compute-cells.html
• openstack-dev cells discussions
• http://www.gossamer-threads.com/lists/openstack/dev/16277
• CERN’s cells architecture
• http://openstack-in-production.blogspot.com/2014/03/cern-cloud-architecture-update-for.html
• Folsom cells design summit slides
• http://comstud.com/FolsomCells.pdf
• Exploring OpenStack Nova Cells
• http://www.dorm.org/blog/exploring-openstack-nova-cells/
• Talks by Rackspace, CERN, NeCTAR
8. PLANNING THE CONVERSION
Goals
• Get to cells before scaling fire drill
• Keep nova RMQ, DB close to compute nodes
• Maintain existing instance state
• Little or no downtime
9. PLANNING THE CONVERSION
Basic plan
• Existing nova becomes first compute cell
• Split RMQ cluster
• Create new nova instance for API cell
• Import data to API cell
• Keep using the existing nova-api service until final cutover
11. ENVIRONMENT PREP
Getting ready
• New servers for the API cell services
• Database for nova API cell
• Migrate non-nova services to new machines
• Network ACLs
• Check DNS
12. ENVIRONMENT PREP
Extra credit: Split RabbitMQ cluster
• Not strictly necessary!
• To minimize downtime and maintain state
• First add new nodes
• Split and contract cluster
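The "add new nodes" and "contract" steps can be sketched with standard `rabbitmqctl` commands. This is a minimal sketch, not the procedure from the talk: it assumes RabbitMQ 3.x (where `join_cluster` and `forget_cluster_node` exist), a shared Erlang cookie on all nodes, and hypothetical node names such as `rabbit@orig-rmq-01`. The `run` helper and `DRY_RUN` variable are illustration only.

```shell
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 executes them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on each NEW node: join the existing cluster through one of its members.
join_cluster() {
    # $1 = existing cluster member, e.g. rabbit@orig-rmq-01 (hypothetical name)
    run rabbitmqctl stop_app &&
    run rabbitmqctl join_cluster "$1" &&
    run rabbitmqctl start_app &&
    run rabbitmqctl cluster_status
}

# Run on a REMAINING member, after the node being removed has been stopped.
forget_node() {
    # $1 = node to drop from the cluster, e.g. rabbit@new-rmq-01 (hypothetical name)
    run rabbitmqctl forget_cluster_node "$1"
}
```

Calling `join_cluster rabbit@orig-rmq-01` in dry-run mode prints the four commands in order, which makes it easy to review the sequence before touching a production broker.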
13. ENVIRONMENT PREP
Expand RabbitMQ cluster
[Diagram: heat, neutron, glance, nova, ceilometer; original RMQ/app servers (to be: compute cell)]
14. ENVIRONMENT PREP
Expand RabbitMQ cluster
[Diagram: heat, neutron, glance, nova, ceilometer; original RMQ/app servers (to be: compute cell); new RMQ/app servers (to be: API cell)]
19. CONFIGURE COMPUTE CELL
Set up record for parent cell
nova-manage cell create \
  --name=api --cell_type=parent \
  --username=api_rmq_user --password=api_rmq_pass \
  --hostname=api_rmq_host --virtual_host=api_rmq_vhost
• Use the API cell RMQ servers!
• Or use cells_config option and put this in json
http://docs.openstack.org/juno/config-reference/content/section_compute-cells.html#cell-config-optional-json
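For the JSON alternative, the same parent-cell record might look like the file written below. This is a sketch under assumptions: the field names (`api_url`, `transport_url`, `weight_offset`, `weight_scale`, `is_parent`) should be checked against the linked configuration reference, the `api_url` host and port 5672 are made-up examples, and `write_cells_json` is a hypothetical helper. The RabbitMQ credentials are the placeholders from the slide.

```shell
# Write a cells JSON file describing the parent (API) cell.
write_cells_json() {
    # $1 = output path, e.g. /etc/nova/cells.json
    cat > "$1" <<'EOF'
{
    "api": {
        "name": "api",
        "api_url": "http://api.example.com:8774",
        "transport_url": "rabbit://api_rmq_user:api_rmq_pass@api_rmq_host:5672/api_rmq_vhost",
        "weight_offset": 0.0,
        "weight_scale": 1.0,
        "is_parent": true
    }
}
EOF
}
```

The file is then referenced from the `[cells]` section of nova.conf via `cells_config = /etc/nova/cells.json`; in that mode the cell records come from the file rather than from `nova-manage cell create`.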
21. CONFIGURE COMPUTE CELL
Enable nova-cells in compute cell
[cells]
enable = true
name = cell_01
cell_type = compute
• Start up nova-cells, verify connections to RMQ
• Do not restart nova-api after this!
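One way to verify the RMQ connections is to count running connections per RabbitMQ user on the broker you want to check. The sketch below assumes `rabbitmqctl list_connections user vhost state` prints tab-separated rows (with a "Listing connections ..." banner line); `count_connections` is a hypothetical helper, and which broker nova-cells appears on, and when, depends on the deployment.

```shell
# Count running AMQP connections for one RabbitMQ user.
# Reads the output of `rabbitmqctl list_connections user vhost state` on stdin.
count_connections() {
    # $1 = RabbitMQ user name
    awk -F'\t' -v user="$1" '
        $1 == user && $3 == "running" { n++ }
        END { print n + 0 }
    '
}

# Usage on a RabbitMQ node (api_rmq_user is the placeholder from the slides):
#   rabbitmqctl list_connections user vhost state | count_connections api_rmq_user
```

A count of zero shortly after starting nova-cells is a hint to check the credentials in the cell record and the network ACLs between the cells.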
22. CONFIGURE COMPUTE CELL
Disable quotas in compute cell
• Quotas will be enforced by the API cell
[DEFAULT]
quota_driver=nova.quota.NoopQuotaDriver
23. BOOTSTRAP NOVA FOR API CELL
Install & configure nova as usual
• Install packages, db sync
• Use the API cell RMQ servers!
• Configure cells options
[cells]
enable = true
name = api
cell_type = api
• Don’t start services yet (need to import data)
24. BOOTSTRAP NOVA FOR API CELL
Set up record for child cell
nova-manage cell create \
  --name=cell_01 --cell_type=child \
  --username=comp_rmq_user --password=comp_rmq_pass \
  --hostname=comp_rmq_host --virtual_host=comp_rmq_vhost
• Use the compute cell RMQ servers!
• Remember cells_config/json option
26. IMPORT NOVA DATA
Seed API cell data
• API cell needs flavor, quota, instance, etc. data
• Must do this directly in SQL
• Shut down nova-api to prevent changes while you do this
mysqldump nova_orig_db table_name |
mysql nova_api_cell_db
27. IMPORT NOVA DATA
Tables to import
• instance_types
• instance_type_extra_specs
• instance_type_projects
• instances
• instance_info_caches
• block_device_mapping
• instance_system_metadata
• instance_groups
• instance_group_member
• instance_group_metadata
• instance_group_policy
• key_pairs
• quota_classes
• quota_usages
• quotas
• snapshots
• snapshot_id_mappings
• virtual_interfaces
• volumes
• May be others you need!
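The per-table mysqldump pipeline can be wrapped in a loop over the table list. This is a minimal sketch: the database names are the placeholders from the previous slide, `DRY_RUN` and the function names are hypothetical, and it assumes MySQL credentials come from `~/.my.cnf` or similar. By default mysqldump emits DROP TABLE and CREATE TABLE statements, so each imported table replaces the empty one created by db sync.

```shell
SRC_DB="${SRC_DB:-nova_orig_db}"      # original nova database (placeholder name)
DST_DB="${DST_DB:-nova_api_cell_db}"  # API cell database (placeholder name)
DRY_RUN="${DRY_RUN:-1}"               # 1 = print commands only, 0 = execute

TABLES="instance_types instance_type_extra_specs instance_type_projects
instances instance_info_caches block_device_mapping instance_system_metadata
instance_groups instance_group_member instance_group_metadata
instance_group_policy key_pairs quota_classes quota_usages quotas
snapshots snapshot_id_mappings virtual_interfaces volumes"

# Copy one table from the source database into the API cell database.
import_table() {
    # $1 = table name
    if [ "$DRY_RUN" = "1" ]; then
        echo "mysqldump $SRC_DB $1 | mysql $DST_DB"
    else
        mysqldump "$SRC_DB" "$1" | mysql "$DST_DB"
    fi
}

# Copy every table in TABLES, stopping at the first failed import.
import_all() {
    for t in $TABLES; do
        import_table "$t" || return 1
    done
}

import_all
```

Run it once in dry-run mode to review the commands, then with `DRY_RUN=0` while nova-api is shut down. Note that a plain pipeline only reports the exit status of `mysql`, so a failed `mysqldump` needs to be checked separately.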
30. CAVEATS
Things that just don’t work
• Neutron vif plugging notifications to nova
vif_plugging_is_fatal = false
vif_plugging_timeout = 5
(But this causes a race condition)
• Any notifications between cells and other services
ceilometer
http://openstack-in-production.blogspot.com/2014/03/cern-cloud-architecture-update-for.html
31. CAVEATS
Things that just don’t work
• nova cells-list “circular reference detected” bug
https://bugs.launchpad.net/nova/+bug/1312002
https://review.openstack.org/#/c/106991/2/nova/cells/state.py
• Console Auth
Make sure to set cells/enable=true on all node types
http://blog.mgagne.ca/nova-cells-and-console-access/
32. CAVEATS
Some objects are not cell-aware
• Flavors and Server Groups
Must exist in API cell and compute cell DB (with same IDs!)
https://github.com/NeCTAR-RC/nova/commit/5abc8847dc89b162b6ae678176a5cfe4989144a9
• Block Devices
http://blog.mgagne.ca/nova-cells-and-block-device-mapping/
• Security groups
• ???
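Since flavors must exist in both databases with the same IDs, a quick consistency check is to dump the flavor rows from each side and diff them. A sketch, assuming the Juno-era `instance_types` table has `id`, `flavorid`, `name`, and an integer `deleted` column (0 for live rows); the function names are hypothetical and the database names are placeholders.

```shell
# Print live flavors as tab-separated "id flavorid name" rows, ordered by id.
list_flavors() {
    # $1 = database name
    mysql -N -B -e "SELECT id, flavorid, name FROM instance_types WHERE deleted = 0 ORDER BY id" "$1"
}

# Compare two saved listings; print the differences and return non-zero on mismatch.
compare_lists() {
    # $1, $2 = files produced by list_flavors
    if diff "$1" "$2" > /dev/null; then
        echo "in sync"
    else
        echo "MISMATCH"
        diff "$1" "$2"
        return 1
    fi
}

# Usage (database names are placeholders):
#   list_flavors nova_api_cell_db > /tmp/api_flavors.txt
#   list_flavors nova_orig_db     > /tmp/cell_flavors.txt
#   compare_lists /tmp/api_flavors.txt /tmp/cell_flavors.txt
```

The same pattern should work for server groups by swapping the query for the `instance_groups` table.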
33. CAVEATS
Host aggregates and availability zones
nova-api server read cell state from DB:
https://github.com/NeCTAR-RC/nova/commit/6fe7057fb4957485d3bac06579ddc38c93458064
Add AZ support for cells:
https://github.com/NeCTAR-RC/nova/commit/048bd2d6d438fb8fa9ad7d3e0d57e7d03c546f6f
Support aggregate API in cells:
https://github.com/NeCTAR-RC/nova/commit/8ca8828d191bc271460eb80567717fd15ef6167c
Ability to filter cells capacity report:
https://github.com/NeCTAR-RC/nova/commit/97921ef1010c5e5bca357d77682bd0ee42d6ffcc
Print cell name in cell timeout exceptions:
https://github.com/NeCTAR-RC/nova/commit/60f669ba1ed5221d71138a72fb2cf3b34c07a970
Use sysmetadata to get instances AZ in API cell:
https://github.com/NeCTAR-RC/nova/commit/95e4cccac623c601e074a618ea71d121a359e00f
Use sysmetadata to get instance_name in API cell:
https://github.com/NeCTAR-RC/nova/commit/6bf1cf78b86bed99733e1119b891397dee15a65e
35. CAVEATS
Other issues
• nova.cells.messaging errors
nova.cells.messaging OperationalError: (OperationalError) (1048, "Column 'instance_uuid' cannot be null") 'UPDATE instance_extra SET updated_at=%s, instance_uuid=%s WHERE instance_extra.id = %s'
No clue on this, but doesn’t seem to break anything
• Database consistency between API and compute cells
Communication interruption between cells can cause this
Use case for running nova-api in compute cells
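Drift between the API cell and a compute cell can be spotted by comparing instance states from both databases. The sketch below assumes a single compute cell, and that the `instances` table has `uuid`, `vm_state`, and an integer `deleted` column; `list_instance_states` and `diff_states` are hypothetical helpers.

```shell
# Print live instances as tab-separated "uuid vm_state" rows.
list_instance_states() {
    # $1 = database name
    mysql -N -B -e "SELECT uuid, vm_state FROM instances WHERE deleted = 0 ORDER BY uuid" "$1"
}

# Report instances whose state differs or that exist on only one side.
diff_states() {
    # $1 = listing from the API cell DB, $2 = listing from the compute cell DB
    awk -F'\t' '
        FILENAME == ARGV[1] { api[$1] = $2; next }
        {
            seen[$1] = 1
            if (!($1 in api))       print $1 "\tmissing in API cell"
            else if (api[$1] != $2) print $1 "\tapi=" api[$1] "\tcompute=" $2
        }
        END {
            for (u in api) if (!(u in seen)) print u "\tmissing in compute cell"
        }
    ' "$1" "$2"
}

# Usage (database names are placeholders):
#   list_instance_states nova_api_cell_db > /tmp/api_states.txt
#   list_instance_states nova_orig_db     > /tmp/cell_states.txt
#   diff_states /tmp/api_states.txt /tmp/cell_states.txt
```

With several compute cells, "missing in compute cell" rows are expected for instances that live in other cells, so the API-side listing would need to be filtered per cell first.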
36. CELLS V2
A better way forward for nova
• Cells is the default mode
• No nova-cells service
• nova-api calls directly to each cell’s DB and message queue
https://wiki.openstack.org/wiki/Nova-Cells-v2
https://etherpad.openstack.org/p/kilo-nova-cells-manifesto
37. CELLS V2
Give me Liberty or give me death!
• Experimental in Liberty
• Transition from no cells v2 should be seamless
• Unclear how cells v1 will migrate to v2
• Unless you really need to go to cells right now …
… wait for Liberty