2. Before we start...
This is a purely technical discussion;
don’t bring politics in. That means none of these:
• Which dept. should be in charge?
• Why not develop in PHP/Java, since nobody else in the
company can program in Ruby?
• How to integrate NML into XX system?
3. Goal
Out-of-band Server Management
Extremely configurable OS install via
SOL (Serial over LAN)
An intelligent system to control the whole
process with minimal human intervention
Build an open-source matrix for Server/OS
distro combinations
4. Status
Member : me wangjunyan (docs)
Subproject Member: lijiehui (LXC:
Linux container environment)
github: https://github.com/op-sdo-com/nml
Fork us!
6. HP is a coward (think of
their WebOS)
IBM, Dell, HP
HP closed the IPMI port (UDP 623), forcing
customers to use iLO.
In practice, iLO is okay. But you need
to buy a license before you can use remote
console redirection, while IBM & Dell
let you do anything!
7. Work through
10.132.17.100-150 (prod. IP range)
10.132.17.200-250 (IPMI IP range)
One-to-One mapping (dynamic IP
allocation is just impossible for now,
but this can be improved)
The current solution is neither secure nor
sufficiently isolated.
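A minimal sketch of the one-to-one mapping as a static lookup table. The table name, the helper name, and the single pairing shown are assumptions made for illustration; only the two address ranges come from the slide.

```ruby
require "ipaddr"

# Address ranges from the slide, both ends inclusive.
PROD_RANGE = Range.new(IPAddr.new("10.132.17.100"), IPAddr.new("10.132.17.150"))
IPMI_RANGE = Range.new(IPAddr.new("10.132.17.200"), IPAddr.new("10.132.17.250"))

# Hypothetical static table: production IP => IPMI IP (illustrative pairing).
IPMI_FOR = {
  "10.132.17.109" => "10.132.17.200"
}.freeze

# Look up the IPMI address for a production address and check that both
# addresses fall inside the ranges above.
def ipmi_address(prod_ip)
  ipmi_ip = IPMI_FOR.fetch(prod_ip) { raise KeyError, "no IPMI mapping for #{prod_ip}" }
  unless PROD_RANGE.cover?(IPAddr.new(prod_ip)) && IPMI_RANGE.cover?(IPAddr.new(ipmi_ip))
    raise ArgumentError, "address outside the configured ranges"
  end
  ipmi_ip
end
```

A static table keeps the mapping explicit; replacing it with dynamic allocation would only change the lookup, not the callers.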
8. Work through
1. Set the machine to boot from PXE, then restart it:
ipmitool -I lanplus -U ibm3550 -H 10.132.17.200 -P XX chassis bootdev pxe
ipmitool -I lanplus -U ibm3550 -H 10.132.17.200 -P XX chassis power cycle
2. Configure the DHCP server to reply by MAC address and refuse any
other DHCP request (!!)
P.S. dhcp3 supports dynamic configuration updates via
OMAPI; see man dhcpd.conf
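Step 1 could be scripted roughly as follows. This is a sketch, not NML's actual code: the method names are hypothetical, and only the ipmitool arguments are taken from the commands above.

```ruby
# Build the argument vector for one ipmitool call over the lanplus interface.
def ipmitool_argv(host:, user:, password:, command:)
  ["ipmitool", "-I", "lanplus", "-U", user, "-H", host, "-P", password, *command]
end

# The two steps from the slide: select PXE as the boot device, then power-cycle.
def pxe_reinstall_commands(host:, user:, password:)
  [%w[chassis bootdev pxe], %w[chassis power cycle]].map do |command|
    ipmitool_argv(host: host, user: user, password: password, command: command)
  end
end

# Run the commands in order and stop at the first failure.
# Requires ipmitool on the PATH and a reachable BMC.
def pxe_reinstall!(**credentials)
  pxe_reinstall_commands(**credentials).each do |argv|
    system(*argv) or raise "ipmitool failed: #{argv.last(3).join(' ')}"
  end
end
```

Passing the arguments as an array avoids shell quoting problems, for example `pxe_reinstall!(host: "10.132.17.200", user: "ibm3550", password: "XX")`.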
9. Architecture
NML encapsulates all the
intelligence in HTTP.
DHCP and iPXE configurations are kept
to a minimum.
Centralized configuration is easy to
maintain.
10. Work through
host aoti_200 {
    # eth0, eth1
    hardware ethernet 00:1A:64:99:E7:50;
    # hardware ethernet 00:1A:64:99:E7:52;
    fixed-address 10.132.17.109;
    server-name "10.132.17.108";
    if exists user-class and option user-class = "iPXE" {
        filename "http://10.132.17.108/nml/ipxe";
    } else {
        filename "undionly.kpxe";
    }
}
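Since every machine needs a stanza like this, it can be generated from inventory data. A minimal sketch follows; the method name and its parameters are hypothetical, while the stanza layout mirrors the one above.

```ruby
# Render one dhcpd.conf host stanza. Clients that identify as iPXE
# (user-class "iPXE") are pointed at the HTTP endpoint; plain PXE ROMs
# first get undionly.kpxe so that they chainload into iPXE.
def dhcp_host_stanza(name:, mac:, ip:, server:)
  <<~CONF
    host #{name} {
        hardware ethernet #{mac};
        fixed-address #{ip};
        server-name "#{server}";
        if exists user-class and option user-class = "iPXE" {
            filename "http://#{server}/nml/ipxe";
        } else {
            filename "undionly.kpxe";
        }
    }
  CONF
end

puts dhcp_host_stanza(name: "aoti_200", mac: "00:1A:64:99:E7:50",
                      ip: "10.132.17.109", server: "10.132.17.108")
```

The user-class test is what stops the boot loop: without it, iPXE would be handed undionly.kpxe again after chainloading.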
11. Work through
iPXE vs. PXE
iPXE liberates us from TFTP (stupid
UDP). iPXE supports HTTP (and even iSCSI),
so the system scales.
iPXE lays the foundation for an
automatic asset management
platform.
13. Work through
From now on, all the network
communication is done through HTTP.
Also, the intelligence comes in:
get '/nml/pxelinux.cfg/:uuid' do
  uuid = params[:uuid]
  install(uuid, get_ipaddr(uuid), get_gateway(uuid),
          get_hostname(uuid), get_iface(uuid), get_baudrate(uuid),
          get_release(uuid))
end
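What such a handler might return can be sketched without the web framework: look up the machine by UUID and render a boot configuration from its settings. Everything below is an assumption made for illustration (the record store, the UUID, the gateway and hostname values, the template, and the debian-installer style kernel parameters); the source only shows the route and the helper names.

```ruby
require "erb"

# Hypothetical per-machine records keyed by UUID. In NML these values come
# from the get_* helpers; the data here is made up.
MACHINES = {
  "00000000-0000-0000-0000-000000000001" => {
    ipaddr: "10.132.17.109", gateway: "10.132.17.1", hostname: "node109",
    iface: "eth0", baudrate: 115200, release: "ubuntu-10.04"
  }
}.freeze

# Hypothetical pxelinux-style template. console=ttyS0,<baud>n8 keeps the
# installer on the serial line so it stays reachable over SOL.
TEMPLATE = ERB.new(<<~CFG)
  default install
  label install
    kernel <%= release %>/linux
    append initrd=<%= release %>/initrd.gz console=ttyS0,<%= baudrate %>n8 interface=<%= iface %> hostname=<%= hostname %> netcfg/get_ipaddress=<%= ipaddr %> netcfg/get_gateway=<%= gateway %>
CFG

# Render the boot configuration for one machine; unknown UUIDs raise KeyError.
def install_config(uuid)
  TEMPLATE.result_with_hash(MACHINES.fetch(uuid))
end

puts install_config("00000000-0000-0000-0000-000000000001")
```

Because the configuration is rendered per request, changing a machine's release or baud rate is a data change on the server, with no DHCP or TFTP files to edit.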
16. Architecture
NML tries to provide maximum flexibility from the bottom.
Policy makers decide how to utilize it.
Maximum flexibility == Each machine can pull its own
configuration set.
NML tries hard to be OS/Hardware independent. (Goal 3:
build a matrix)
17. Architecture
I know real-world ops desperately want consistency, but
that is policy.
NML focuses on mechanism.
Why does flexibility matter? Some real-world examples:
1. Let the system generate a distinct password for every machine. I love elegant
solutions to security.
2. Gain access to the partition manager. (ext3, ext4, btrfs and LVM!)
3. Move the prelinux script to the preseeding stage to ensure continuous integration
of company policy. (Lesson: policies can never be applied without powerful
infra.)
4. Automatic network interface configuration. The Ubuntu installer smartly applies
the network configuration to /etc/network/interfaces, and so does CentOS’s anaconda.
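Example 1 (a distinct password per machine) is easy to sketch. The helper name and the Hash used as a secret store are hypothetical stand-ins; real storage and hashing of the password for the preseed file are left out.

```ruby
require "securerandom"

# Return the password for a machine, generating and remembering a fresh
# random one on first request. `store` is any Hash-like object, standing in
# for whatever secret storage is actually used.
def password_for(uuid, store, length: 20)
  store[uuid] ||= SecureRandom.alphanumeric(length)
end

store = {}
first  = password_for("machine-a", store)
second = password_for("machine-b", store)
puts first == second                              # distinct machines get distinct passwords
puts first == password_for("machine-a", store)    # repeated requests are stable
```

Generating the password at request time means it never has to appear in a shared preseed template.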
18. Architecture
Preseed/Kickstart vs. image clone
• Preseeding is slow. Although the installer can use a yum/apt mirror to speed
up package downloads, the entire retrieve-prepare-configure cycle can’t be
optimized further.
• Image cloning is suitable for creating VMs (Xen, LXC, etc.), but it is too dumb
to do anything intelligent.
But we want the best of both worlds!
Solution:
n_preseed = normalize(uuid.preseed, uuid.hardware)
if n_preseed.exists?
  n_preseed.clone(server_ip, uuid)
else
  install(uuid)
end
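The clone-or-install decision can be sketched as a cache keyed by the normalized preseed plus the hardware model. The normalization rule (drop comments and blank lines, then hash), the method names, and the in-memory image table are assumptions; the source does not say how NML normalizes a preseed.

```ruby
require "digest"

# Reduce a preseed and a hardware model to a stable cache key. Comments and
# blank lines are ignored so that cosmetic edits do not defeat the cache.
def normalize(preseed_text, hardware_model)
  canonical = preseed_text.lines.map(&:strip)
                          .reject { |line| line.empty? || line.start_with?("#") }
                          .join("\n")
  Digest::SHA256.hexdigest("#{hardware_model}\n#{canonical}")
end

# Decide how to provision: clone a cached image if one exists for this
# preseed/hardware pair, otherwise fall back to a full install.
# `images` maps cache keys to image identifiers (hypothetical store).
def provision(preseed_text, hardware_model, images)
  key = normalize(preseed_text, hardware_model)
  if images.key?(key)
    [:clone, images[key]]
  else
    [:install, key] # after the install, an image would be saved under this key
  end
end
```

The first machine with a given configuration pays for the slow preseed install; later machines with the same key get the fast clone.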
19. Architecture
1. The yum/apt mirror ensures a 99% cache hit rate; all packages are
pulled from the LAN. The local master only maintains a cache.
2. Why not mirror the upstream repo directly?
  1. The bandwidth of an upstream mirror is likely to fluctuate (e.g.,
  us.archive.ubuntu.com)
  2. Most packages will never be downloaded. In fact, a standard
  installation of CentOS 6.0 needs fewer than 380 packages, whereas a
  full-fledged repo contains 15K (about 2.5%).
3. Repo implementations
  1. Yum: nginx error_page + proxy_pass + ppull.rb
  upstream mirror: mirrors.sdo.com (Why not proxy_cache? Because nginx
  has some issues with range requests when proxy_cache is enabled.)
  2. Apt: apt-cacher-ng
  upstream mirror: mirror.lupaworld.com
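The idea of a cache that only pulls what is actually requested can be sketched as follows. This is not ppull.rb, whose contents the slides do not show; the class name and the injected `fetch` callable are assumptions that keep the example free of network access.

```ruby
require "fileutils"

# Serve a package from the local cache, pulling from upstream only on a miss.
# `fetch` is any callable that takes a relative path and returns the file's
# bytes; in production it would be an HTTP GET against the upstream mirror.
class PullThroughCache
  attr_reader :hits, :misses

  def initialize(root, fetch)
    @root = root
    @fetch = fetch
    @hits = 0
    @misses = 0
  end

  def get(rel_path)
    raise ArgumentError, "path escapes cache root" if rel_path.split("/").include?("..")

    local = File.join(@root, rel_path)
    if File.file?(local)
      @hits += 1
    else
      @misses += 1
      FileUtils.mkdir_p(File.dirname(local))
      # Write under a temporary name first so a partial download is never served.
      tmp = "#{local}.part"
      File.binwrite(tmp, @fetch.call(rel_path))
      File.rename(tmp, local)
    end
    File.binread(local)
  end
end
```

Only packages that some installer actually asked for ever reach the local disk, which fits the observation that a standard install touches a small fraction of the repo.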
20. The Matrix
Columns (OS distro): Ubuntu 10.04 | Ubuntu 11.04 | CentOS 5.6 | CentOS 6.0 | RHEL 5.6 | RHEL 6.0 | Arch Linux | FreeBSD | Gentoo | Fedora | Debian
Rows (server):
IBM x3550: Y Y Y Y
HP ProLiant DL360 G5:
IBM x3550 M2:
Dell PowerEdge R610:
HP ProLiant DL385 G2:
IBM BladeCenter LS22:
• Y means both i386 and amd64 passed
• Y* means the entry M[i][j] needs extra configuration
21. Architecture
1. Why does hardware depend on the OS distro?
Every OS distro may bring surprises, e.g.
the radeon driver in Ubuntu 11.04 (codename Natty) is
incompatible with the IBM x3550: you get a kernel panic after
installation.
2. What’s the purpose of supporting all Linux distros?
• We want Total World Domination
• NML is about mechanism, not policy
• Linode supports all distros on Xen! Our task is easier.
3. Is it time-consuming to support all Linux distros?
Just do it.