Architecture of a Next-Generation Parallel File System

- Introduction
- What's in the code now
- Futures
Agenda

An Introduction
What is OrangeFS?

•  OrangeFS is a next-generation parallel file system
•  Based on PVFS
•  Distributes file data across multiple file servers, leveraging any block-level file system
•  Distributes metadata across 1 to all storage servers
•  Supports simultaneous access by multiple clients, including Windows, using the PVFS protocol directly
•  Works with standard kernel releases and does not require custom kernel patches
•  Easy to install and maintain
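As an illustration of how file data is distributed, the sketch below maps a file byte offset to a server under simple round-robin striping. It is a hedged example only: the function name, stripe size, and server count are invented here and are not the OrangeFS API.

```python
def stripe_location(offset, stripe_size, num_servers):
    """Map a file byte offset to (server_index, offset_in_server_stream).

    Round-robin striping: stripe 0 on server 0, stripe 1 on server 1, ...
    """
    stripe_index = offset // stripe_size
    server = stripe_index % num_servers
    # Position within that server's concatenated sequence of stripes
    local = (stripe_index // num_servers) * stripe_size + offset % stripe_size
    return server, local

# A 64 KiB stripe over 4 servers: byte 0 lands on server 0,
# byte 65536 on server 1, and the 5th stripe wraps back to server 0.
print(stripe_location(0, 65536, 4))        # (0, 0)
print(stripe_location(65536, 65536, 4))    # (1, 0)
print(stripe_location(262144, 65536, 4))   # (0, 65536)
```

The point of the layout is that large sequential reads and writes engage all servers at once instead of one.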
Why a Parallel File System?

HPC – Data Intensive (parallel PVFS protocol)
•  Large datasets
•  Checkpointing
•  Visualization
•  Video
•  Big Data

Unstructured Data Silos – Interfaces to Match Problems
•  Unify dispersed file systems
•  Simplify storage leveling
§  Multidimensional arrays
§  Typed data
§  Portable formats
Original PVFS Design Goals

§  Scalable
§  Configurable file striping
§  Non-contiguous I/O patterns
§  Eliminates bottlenecks in the I/O path
§  Does not need locks for metadata ops
§  Does not need locks for non-conflicting applications
§  Usability
§  Very easy to install; small VFS kernel driver
§  Modular design for disk, network, etc.
§  Easy to extend -> hundreds of research projects have used it, including dissertations, theses, etc.
OrangeFS Philosophy

•  Focus on a broader set of applications
•  Customer & community focused (>300-member community & growing)
•  Open source
•  Commercially viable
•  Enable research

Configurability · Performance · Consistency · Reliability
System Architecture

•  OrangeFS servers manage objects
•  Objects map to a specific server
•  Objects store data or metadata
•  Request protocol specifies operations on one or more objects
•  OrangeFS object implementation
•  DB for indexing key/value data
•  Local block file system for the data stream of bytes

Current Architecture: Client / Server
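The object implementation described above pairs a key/value store with a byte stream. A minimal sketch of that split, with invented names (this is not the OrangeFS storage code):

```python
class DataSpace:
    """One server-side object: indexed key/value data plus a byte stream."""
    def __init__(self):
        self.keyval = {}             # the DB role: attributes, directory entries
        self.bstream = bytearray()   # the local-block-file-system role: file bytes

obj = DataSpace()
obj.keyval["owner"] = "alice"        # metadata goes in the key/value side
obj.bstream += b"hello"              # file contents go in the byte stream
print(obj.keyval["owner"], len(obj.bstream))  # alice 5
```

A metadata object would use only the key/value side; a datafile object mostly the byte stream.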
Project Timeline

•  1994–2004: PVFS. Design and development at CU (Dr. Ligon) + ANL (CU graduates)
•  2004–2010: PVFS2. Primary maintenance & development by ANL (CU graduates) + community
•  2007–2010: New PVFS branch; new development focused on a broader set of problems
•  SC10 (fall 2010): OrangeFS announced with the community; now the mainline of future development as of 2.8.4. Improved MD, stability, server-side operations, newer kernels, testing. Support and targeted development services initially offered by Omnibond
•  SC11 (fall 2011): 2.8.5 + Win. Windows client, stability, replicate on immutable
•  Spring 2012: 2.8.6 + Webpack. Performance improvements, Direct Lib + cache, stability, WebDAV, S3
•  Winter 2013: 2.8.7 + Webpack. Performance improvements, stability
•  Spring 2014: 2.8.8 + Webpack. Performance improvements, stability, shared mmap, multi TCP/IP server homing, Hadoop MapReduce, user lib fixes, new spec file for RPMs + DKMS. Available in the AWS Marketplace
•  Summer 2014: 2.9.0. Distributed directory MD, capability-based security
•  2015: OrangeFS 3.0. Replicated MD and file data, 128-bit UUIDs for file handles, parallel background processes, web-based management UI, self-healing processes, data balancing
In the Code Now
  
Server-to-Server Communications (2.8.5)

Traditional metadata operation: a create request causes the client to communicate with all servers, O(p).

Scalable metadata operation: a create request communicates with a single server, which in turn communicates with the other servers using a tree-based protocol, O(log p).

[Diagram: app -> client middleware -> client/server over the network, shown for both the traditional fan-out and the tree-based case]
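The O(p) vs. O(log p) claim can be made concrete with a toy round count: in the direct scheme the client issues one request per server, while in a tree-based broadcast every already-informed server forwards to a few more each round. Fan-out and function names below are illustrative, not the actual protocol parameters.

```python
def direct_rounds(p):
    """Client contacts each of p servers itself: O(p) requests."""
    return p

def tree_rounds(p, fanout=2):
    """Rounds until p servers are informed when each informed server
    forwards to `fanout` new servers per round: O(log p)."""
    rounds, informed = 0, 1
    while informed < p:
        informed += informed * fanout   # every informed server recruits `fanout` more
        rounds += 1
    return rounds

for p in (4, 64, 1024):
    print(p, direct_rounds(p), tree_rounds(p))
```

At 1024 servers the tree finishes in a handful of rounds where the direct scheme needs 1024 requests, which is the win behind server-to-server creates.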
Recent Additions (2.8.5)

•  SSD metadata storage
•  Replicate on immutable (file based)
•  Windows client: supports Windows 32/64-bit; Server 2008, R2, Vista, 7
Direct Access Interface (2.8.6)

•  Implements:
•  POSIX system calls
•  Stdio library calls
•  Parallel extensions
•  Noncontiguous I/O
•  Non-blocking I/O
•  MPI-IO library
•  Found more boundary conditions; fixed in upcoming 2.8.7

[Diagram: kernel path (app -> kernel -> PVFS lib -> client core) vs. direct path (app -> direct lib -> PVFS lib), both reaching the file system over IB or TCP]
Direct Interface Client Caching (2.8.6)

•  The Direct Interface enables multi-process coherent client caching for a single client

[Diagram: client application -> Direct Interface -> client cache -> file system]
WebDAV (2.8.6 webpack)

Apache speaks the PVFS protocol to OrangeFS.

•  Supports the DAV protocol; tested with the Litmus DAV test suite
•  Supports DAV cooperative locking in metadata
  
S3 (2.8.6 webpack)

Apache speaks the PVFS protocol to OrangeFS.

•  Tested using the s3cmd client
•  Files accessible via other access methods
•  Containers are directories
•  Accounting pieces not implemented
Summary – Recently Added to OrangeFS

•  In 2.8.3
•  Server-to-server communication
•  SSD metadata storage
•  Replicate on immutable
•  2.8.4, 2.8.5 (fixes, support for newer kernels)
•  Windows client
•  2.8.6 – performance, fixes, IB updates
•  Direct access libraries (initial release)
•  Preload library for applications, including optional client cache
•  Webpack: WebDAV (with file locking), S3
OrangeFS on AWS Marketplace

Available on the Amazon AWS Marketplace, brought to you by Omnibond.

[Diagram: OrangeFS instances forming a unified high-performance file system, backed by EBS volumes, with DynamoDB]
In 2.8.8 (Just Released)
Hadoop JNI Interface (2.8.8)

•  OrangeFS Java Native Interface
•  Extension of the Hadoop FileSystem class -> JNI
•  Buffering
•  Distribution
•  Fast PVFS protocol for remote configuration
Additional Items (2.8.8)

•  Updated user lib
•  Shared mmap support in kernel module
•  Support for kernels up to 3.11
•  Multi-homing servers over IP
•  Clients can access a server over multiple interfaces (say clients on IPoIB + clients on IPoEthernet + clients on IPoMX)
•  Enterprise installers (coming shortly)
•  Client (with DKMS for kernel module)
•  Server
•  Devel
Performance

Scaling Tests

16 storage servers, each with 2 LVM'd 5+1 RAID sets, were tested with up to 32 clients, with read performance reaching nearly 12 GB/s and write performance reaching nearly 8 GB/s.
MapReduce over OrangeFS

•  8 Dell R720 servers connected with 10 Gb/s Ethernet
•  The remote case adds an additional 8 identical servers; all OrangeFS work is done remotely, and only local work is done on the compute node (traditional HPC model)
•  *25% improvement with OrangeFS running remotely
MapReduce over OrangeFS

•  8 Dell R720 servers connected with 10 Gb/s Ethernet
•  Remote clients are R720s with single SAS disks for local data (vs. 12-disk arrays in the previous test)
SC13 Demo Overview

[Diagram: OrangeFS clients on the SC13 floor (Clemson, USC, I2, Omnibond) connected over the I2 Innovation Platform at 100 Gb/s to 16 Dell R720 OrangeFS servers]

SC13 WAN Performance

Multiple concurrent client file creates over the PVFS protocol (nullio).
For 2.9 (Summer 2014)
  
Distributed Directory Metadata (2.9.0)

[Diagram: DirEnt1–DirEnt6 distributed by extensible hashing across Server0–Server3]

•  State management based on GIGA+ (Garth Gibson, CMU)
•  Improves access times for directories with a very large number of entries
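The diagram's placement of directory entries can be sketched as hash-based assignment: hash the entry name and use the result to pick a directory server. This is a hedged steady-state illustration only; real GIGA+ grows the server set by incremental range splitting, which is not modeled here.

```python
import hashlib

def dirent_server(name, num_servers):
    """Pick the server holding a directory entry by hashing its name.

    Illustrative: takes 64 bits of SHA-1 and reduces modulo the server
    count, so all clients agree on placement without coordination.
    """
    h = int.from_bytes(hashlib.sha1(name.encode()).digest()[:8], "big")
    return h % num_servers

entries = ["DirEnt%d" % i for i in range(1, 7)]
placement = {e: dirent_server(e, 4) for e in entries}
print(placement)
```

Because lookups hash directly to one server, a huge directory no longer funnels every readdir/create through a single metadata server.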
Capability-Based Security (2.9.0)

[Diagram: a client presents a cert or credential, receives a signed capability, and presents signed capabilities for I/O; OpenSSL PKI underneath]

•  3 security modes
•  Basic – OrangeFS/PVFS classic mode
•  Key-based – keys are used to authorize clients for use with the FS
•  User certificate based with LDAP – user certs are used for access to the file system and are generated based on LDAP uid/gid info
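The flow in the diagram, issue a signed capability naming the object and allowed operations, then verify it before serving I/O, can be sketched as below. OrangeFS uses OpenSSL public-key signatures; this sketch substitutes a stdlib HMAC shared secret purely for brevity, and the token format and names are invented.

```python
import hmac, hashlib

SECRET = b"demo-server-key"   # hypothetical; stands in for the PKI key material

def issue_capability(oid, ops):
    """Metadata server grants a token binding an object ID to allowed ops."""
    payload = f"{oid}:{','.join(sorted(ops))}".encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload, sig

def verify_capability(payload, sig):
    """I/O server checks the signature before touching data."""
    expect = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expect, sig)

cap, sig = issue_capability("0xabc123", {"read", "write"})
print(verify_capability(cap, sig))          # True
print(verify_capability(cap + b"x", sig))   # False: tampered payload rejected
```

The design point is that I/O servers never consult LDAP or the metadata server per request; the signature alone proves authorization.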
For v3
Replication / Redundancy

•  Redundant metadata
•  Seamless recovery after a failure
•  Redundant objects from the root directory down
•  Configurable
•  Redundant data
•  Update mode (real time, on close, on immutable, none)
•  Configurable number of replicas
•  Real-time "forked flow" work shows little overhead
•  Replicate on close
•  Replicate to external (like LTFS)
•  Looking at supporting an HSM option to external (no local replica)
•  Emphasis on continuous operation
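The knobs above, update mode, replica count, external target, could be captured in a policy record like the following. This is purely a hypothetical shape with invented field names, not the actual OrangeFS configuration keys.

```python
# Hypothetical replication policy mirroring the options listed above.
REPLICATION_POLICY = {
    "metadata": {"redundant": True, "copies": 2},
    "data": {
        # one of: real_time | on_close | on_immutable | none
        "update_mode": "on_close",
        "copies": 2,
        # e.g. an LTFS archive with no local replica; None = local only
        "external_target": None,
    },
}

VALID_MODES = {"real_time", "on_close", "on_immutable", "none"}
assert REPLICATION_POLICY["data"]["update_mode"] in VALID_MODES
print("policy ok")
```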
OrangeFS 3.0: Handles -> UUIDs

•  An OID (object identifier) is a 128-bit UUID that is unique to the data-space
•  An SID (server identifier) is a 128-bit UUID that is unique to each server
•  No more than one copy of a given data-space can exist on any server
•  The (OID, SID) tuple is unique within the file system
•  (OID, SID1), (OID, SID2), (OID, SID3) are copies of the same object on different servers
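The handle scheme can be shown directly with the stdlib `uuid` module: the OID names the logical object, and each replica is addressed by an (OID, SID) pair, so the pair is unique file-system-wide while the OID alone is shared by all copies. A small sketch:

```python
import uuid

oid = uuid.uuid4()                        # 128-bit object identifier
sids = [uuid.uuid4() for _ in range(3)]   # one SID per server holding a copy

replicas = {(oid, sid) for sid in sids}   # the (OID, SID) tuples

assert len(replicas) == 3                       # three distinct copies...
assert len({o for o, _ in replicas}) == 1       # ...of one logical object
print(oid, len(replicas))
```

Since a server's SID appears at most once per OID, the "no more than one copy per server" rule falls out of the tuple's uniqueness.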
  
OrangeFS 3.0: Server Location / SID Management

•  In an exascale environment with the potential for thousands of I/O servers, it will no longer be feasible for each server to know about all other servers
•  Server discovery
•  Servers will know a subset of their neighbors at startup (or may cache them from previous startups); similar to DNS domains
•  Servers will learn about unknown servers on an as-needed basis and cache them; similar to DNS query mechanisms (root servers, authoritative domain servers)
•  SID cache: an in-memory DB to store server attributes
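The lazy, DNS-like lookup described above amounts to a cache with an on-miss resolver. A minimal sketch, with `resolve_remote` standing in for the real query protocol (all names here are invented for illustration):

```python
class SIDCache:
    """In-memory map of SID -> server attributes, filled on demand."""
    def __init__(self, seed):
        self.cache = dict(seed)           # neighbors known at startup

    def lookup(self, sid, resolve_remote):
        if sid not in self.cache:         # miss: ask the network once
            self.cache[sid] = resolve_remote(sid)
        return self.cache[sid]            # hit: no network traffic

queries = []
def resolve_remote(sid):
    queries.append(sid)                   # count remote resolutions
    return {"addr": f"10.0.0.{sid}"}

c = SIDCache({1: {"addr": "10.0.0.1"}})
c.lookup(2, resolve_remote)
c.lookup(2, resolve_remote)               # second call served from cache
print(len(queries))                       # 1
```

As with DNS, correctness tolerates staleness: a cached entry that goes bad is simply re-resolved.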
  
OrangeFS 3.0: Policy-Based Location

•  User-defined attributes for servers and clients
•  Stored in the SID cache
•  Policy is used for data location, replication location, and multi-tenant support
•  Completely flexible
•  Rack
•  Row
•  App
•  Region
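Placement policy over user-defined attributes can be sketched as a filter on the SID cache: a policy is a set of required attribute values, and eligible servers are those that match. Attribute names (rack, region) come from the slide; the values and function names are invented.

```python
# Server attributes as they might sit in the SID cache (values invented).
servers = {
    "sid-1": {"rack": "r1", "region": "us-east"},
    "sid-2": {"rack": "r2", "region": "us-east"},
    "sid-3": {"rack": "r1", "region": "eu-west"},
}

def place(policy):
    """Return servers whose attributes satisfy every policy constraint."""
    return sorted(s for s, attrs in servers.items()
                  if all(attrs.get(k) == v for k, v in policy.items()))

print(place({"region": "us-east"}))   # ['sid-1', 'sid-2']
print(place({"rack": "r1"}))          # ['sid-1', 'sid-3']
```

The same filter serves data location, replica placement (e.g. "replicas in different racks"), and multi-tenancy (e.g. "only servers tagged for app X").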
Background Parallel Processing Infrastructure (3.0)

•  Modular infrastructure to easily build background parallel processes for the file system

Used for:
•  Gathering stats for monitoring
•  Usage calculation (can be leveraged for directory space restrictions, chargebacks)
•  Background safe FSCK processing (can mark bad items in MD)
•  Background checksum comparisons
•  Etc.
Admin REST Interface / Admin UI (3.0)

[Diagram: management clients hit a REST interface on Apache, which speaks the PVFS protocol to OrangeFS]
Data Migration / Management (3.0)

•  Built on redundancy & background processes
•  Migrate objects between servers
•  De-populate a server going out of service
•  Populate a newly activated server (HW lifecycle)
•  Moving computation to data
•  Hierarchical storage
•  Use existing metadata services
•  Possible: directory hierarchy cloning
•  Copy on write (Dev, QA, Prod environments with a high % of data overlap)
OrangeFS 3.x: Hierarchical Data Management

[Diagram: OrangeFS metadata coordinating users and HPC systems across tiers, archive, intermediate storage, NFS, and remote systems (exceed, OSG, Lustre, GPFS, Ceph, Gluster)]
OrangeFS 3.x: Attribute-Based Metadata Search

•  Client tags files with keys/values
•  Keys/values indexed on metadata servers
•  Clients query for files based on keys/values
•  Returns file handles, with options for filename and path

[Diagram: key/value parallel query fanned out across metadata servers; file access to data follows]
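The tag-then-query flow above can be sketched with an inverted index: each (key, value) pair maps to the set of file handles carrying that tag, and a query intersects the sets. The in-memory dict stands in for the distributed index on the metadata servers; all names are illustrative.

```python
index = {}   # (key, value) -> set of file handles carrying that tag

def tag(handle, **kv):
    """Client tags a file; each pair lands in the inverted index."""
    for k, v in kv.items():
        index.setdefault((k, v), set()).add(handle)

def query(**kv):
    """Return handles matching ALL given key/value pairs."""
    sets = [index.get(item, set()) for item in kv.items()]
    return set.intersection(*sets) if sets else set()

tag("h1", project="climate", kind="checkpoint")
tag("h2", project="climate", kind="viz")
print(query(project="climate", kind="checkpoint"))  # {'h1'}
```

Resolving handles back to filenames and paths is then an optional second step, as the slide notes.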
  
Beyond OrangeFS NEXT

Extend capability-based security:
•  Enable certificate-level access (in process)
•  Federated-access capable
•  Can be integrated with rules-based access control
•  Department x in company y can share with department q in company z
•  Rules and roles establish the relationship
•  Each company manages its own control of who is in the company and in the department
SDN – OpenFlow

•  Working with the OpenFlow (OF) research team at CU
•  OF separates the control plane from delivery, giving the ability to control the network with software
•  Looking at bandwidth optimization leveraging OF and OrangeFS
ParalleX

ParalleX is a new parallel execution model.

•  Key components are:
•  Asynchronous Global Address Space (AGAS)
•  Threads
•  Parcels (message driven instead of message passing)
•  Locality
•  Percolation
•  Synchronization primitives
•  High Performance ParalleX (HPX) library implementation written in C++
PXFS

•  Parallel I/O for ParalleX, based on PVFS
•  Common themes with OrangeFS Next
•  Primary objective: unification of the ParalleX and storage name spaces
•  Integration of the AGAS and storage metadata subsystems
•  Persistent object model
•  Extends ParalleX with a number of I/O concepts
•  Replication
•  Metadata
•  Extending I/O with ParalleX concepts
•  Moving work to data
•  Local synchronization
•  Effort with LSU, Clemson, and Indiana U. (Walt Ligon, Thomas Sterling)
Community

Johns Hopkins OrangeFS Selection

•  JHU HLTCOE selected OrangeFS
•  After evaluating Ceph, GlusterFS, Lustre, and OrangeFS

"Leveraging OrangeFS for the parallel filesystem, the system as a whole is capable of delivering 30GB/s write, 46GB/s read, and between 37,260-237,180 IOPS of performance. The variation in IOPS performance is dependent on the file size and number of bytes written per commit as documented in the Test Results section."*

"The final system design represents a 2,775% increase in read performance and a 1,763-11,759% increase in IOPS"*

* http://hltcoe.jhu.edu/uploads/publications/papers/14662_slides.pdf
Learning More

•  www.orangefs.org web site
•  Releases
•  Documentation
•  Wiki
•  pvfs2-users@beowulf-underground.org: support for users
•  pvfs2-developers@beowulf-underground.org: support for developers
Support & Development Services

•  www.orangefs.com & www.omnibond.com
•  Professional support & development team
•  Buy into the project
 
Omnibond Info

Solution areas:
•  Intelligent Transportation Solutions
•  Identity Manager Drivers & Sentinel Connectors
•  Parallel Scale-Out Storage Software
•  Social Media Interaction System
•  Computer Vision: Enterprise, Personal

Discussion

Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
 

Más de Great Wide Open

The Little Meetup That Could
The Little Meetup That CouldThe Little Meetup That Could
The Little Meetup That CouldGreat Wide Open
 
Lightning Talk - 5 Hacks to Getting the Job of Your Dreams
Lightning Talk - 5 Hacks to Getting the Job of Your DreamsLightning Talk - 5 Hacks to Getting the Job of Your Dreams
Lightning Talk - 5 Hacks to Getting the Job of Your DreamsGreat Wide Open
 
Breaking Free from Proprietary Gravitational Pull
Breaking Free from Proprietary Gravitational PullBreaking Free from Proprietary Gravitational Pull
Breaking Free from Proprietary Gravitational PullGreat Wide Open
 
Dealing with Unstructured Data: Scaling to Infinity
Dealing with Unstructured Data: Scaling to InfinityDealing with Unstructured Data: Scaling to Infinity
Dealing with Unstructured Data: Scaling to InfinityGreat Wide Open
 
You Don't Know Node: Quick Intro to 6 Core Features
You Don't Know Node: Quick Intro to 6 Core FeaturesYou Don't Know Node: Quick Intro to 6 Core Features
You Don't Know Node: Quick Intro to 6 Core FeaturesGreat Wide Open
 
Using Cryptography Properly in Applications
Using Cryptography Properly in ApplicationsUsing Cryptography Properly in Applications
Using Cryptography Properly in ApplicationsGreat Wide Open
 
Lightning Talk - Getting Students Involved In Open Source
Lightning Talk - Getting Students Involved In Open SourceLightning Talk - Getting Students Involved In Open Source
Lightning Talk - Getting Students Involved In Open SourceGreat Wide Open
 
You have Selenium... Now what?
You have Selenium... Now what?You have Selenium... Now what?
You have Selenium... Now what?Great Wide Open
 
How Constraints Cultivate Growth
How Constraints Cultivate GrowthHow Constraints Cultivate Growth
How Constraints Cultivate GrowthGreat Wide Open
 
Troubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingTroubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingGreat Wide Open
 
The Current Messaging Landscape
The Current Messaging LandscapeThe Current Messaging Landscape
The Current Messaging LandscapeGreat Wide Open
 
Understanding Open Source Class 101
Understanding Open Source Class 101Understanding Open Source Class 101
Understanding Open Source Class 101Great Wide Open
 
Elasticsearch for SQL Users
Elasticsearch for SQL UsersElasticsearch for SQL Users
Elasticsearch for SQL UsersGreat Wide Open
 

Más de Great Wide Open (20)

The Little Meetup That Could
The Little Meetup That CouldThe Little Meetup That Could
The Little Meetup That Could
 
Lightning Talk - 5 Hacks to Getting the Job of Your Dreams
Lightning Talk - 5 Hacks to Getting the Job of Your DreamsLightning Talk - 5 Hacks to Getting the Job of Your Dreams
Lightning Talk - 5 Hacks to Getting the Job of Your Dreams
 
Breaking Free from Proprietary Gravitational Pull
Breaking Free from Proprietary Gravitational PullBreaking Free from Proprietary Gravitational Pull
Breaking Free from Proprietary Gravitational Pull
 
Dealing with Unstructured Data: Scaling to Infinity
Dealing with Unstructured Data: Scaling to InfinityDealing with Unstructured Data: Scaling to Infinity
Dealing with Unstructured Data: Scaling to Infinity
 
You Don't Know Node: Quick Intro to 6 Core Features
You Don't Know Node: Quick Intro to 6 Core FeaturesYou Don't Know Node: Quick Intro to 6 Core Features
You Don't Know Node: Quick Intro to 6 Core Features
 
Hidden Features in HTTP
Hidden Features in HTTPHidden Features in HTTP
Hidden Features in HTTP
 
Using Cryptography Properly in Applications
Using Cryptography Properly in ApplicationsUsing Cryptography Properly in Applications
Using Cryptography Properly in Applications
 
Lightning Talk - Getting Students Involved In Open Source
Lightning Talk - Getting Students Involved In Open SourceLightning Talk - Getting Students Involved In Open Source
Lightning Talk - Getting Students Involved In Open Source
 
You have Selenium... Now what?
You have Selenium... Now what?You have Selenium... Now what?
You have Selenium... Now what?
 
How Constraints Cultivate Growth
How Constraints Cultivate GrowthHow Constraints Cultivate Growth
How Constraints Cultivate Growth
 
Inner Source 101
Inner Source 101Inner Source 101
Inner Source 101
 
Running MySQL on Linux
Running MySQL on LinuxRunning MySQL on Linux
Running MySQL on Linux
 
Search is the new UI
Search is the new UISearch is the new UI
Search is the new UI
 
Troubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed DebuggingTroubleshooting Hadoop: Distributed Debugging
Troubleshooting Hadoop: Distributed Debugging
 
The Current Messaging Landscape
The Current Messaging LandscapeThe Current Messaging Landscape
The Current Messaging Landscape
 
Apache httpd v2.4
Apache httpd v2.4Apache httpd v2.4
Apache httpd v2.4
 
Understanding Open Source Class 101
Understanding Open Source Class 101Understanding Open Source Class 101
Understanding Open Source Class 101
 
Thinking in Git
Thinking in GitThinking in Git
Thinking in Git
 
Antifragile Design
Antifragile DesignAntifragile Design
Antifragile Design
 
Elasticsearch for SQL Users
Elasticsearch for SQL UsersElasticsearch for SQL Users
Elasticsearch for SQL Users
 

Último

Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 

Último (20)

Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 

Architecture of a Next-Generation Parallel File System

  • 1. Architecture of a Next-Generation Parallel File System
  • 2. Agenda: Introduction; What's in the code now; Futures
  • 4. What is OrangeFS?
    - OrangeFS is a next-generation parallel file system, based on PVFS
    - Distributes file data across multiple file servers, leveraging any block-level file system
    - Distributes metadata across one to all storage servers
    - Supports simultaneous access by multiple clients, including Windows, using the PVFS protocol directly
    - Works with standard kernel releases and does not require custom kernel patches
    - Easy to install and maintain
  • 5. Why a Parallel File System?
    - HPC / data-intensive workloads (parallel PVFS protocol): large datasets, checkpointing, visualization, video, Big Data
    - Unstructured data silos: unify dispersed file systems, simplify storage leveling
    - Interfaces to match problems: multidimensional arrays, typed data, portable formats
  • 6. Original PVFS Design Goals
    - Scalable: configurable file striping; non-contiguous I/O patterns; eliminates bottlenecks in the I/O path; no locks needed for metadata ops or for non-conflicting applications
    - Usability: very easy to install; small VFS kernel driver; modular design for disk, network, etc.
    - Easy to extend -> hundreds of research projects have used it, including dissertations, theses, etc.
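The "configurable file striping" goal above can be illustrated with a minimal sketch. This is not the actual OrangeFS distribution code; the function name and the 64 KiB stripe size are illustrative only (the real stripe size and distribution are configurable):

```python
# Hypothetical sketch of round-robin file striping: a logical byte
# offset in a file maps to one of N data servers plus an offset within
# that server's local stream of bytes.

STRIPE_SIZE = 64 * 1024  # illustrative stripe unit, in bytes

def locate(offset, num_servers, stripe_size=STRIPE_SIZE):
    """Map a logical file offset to (server_index, local_offset)."""
    stripe = offset // stripe_size            # which stripe unit the offset falls in
    server = stripe % num_servers             # stripe units go round-robin across servers
    local_stripe = stripe // num_servers      # units this server already holds before it
    local_offset = local_stripe * stripe_size + offset % stripe_size
    return server, local_offset
```

With four servers, consecutive stripe units land on servers 0, 1, 2, 3, 0, 1, ... so large sequential reads and writes fan out across all servers in parallel.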
  • 7. OrangeFS Philosophy
    - Focus on a broader set of applications
    - Customer and community focused (>300-member community and growing)
    - Open source, commercially viable
    - Enable research
  • 9. System Architecture
    - OrangeFS servers manage objects; objects map to a specific server and store data or metadata
    - The request protocol specifies operations on one or more objects
    - OrangeFS object implementation: a DB for indexing key/value data, plus the local block file system for the data stream of bytes
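A toy sketch of "objects map to a specific server": a stable hash of the object handle selects the one responsible server, so any client can locate an object without a central lookup. This is an assumption-laden illustration, not the actual OrangeFS mapping algorithm:

```python
import hashlib

# Illustrative only: derive a server deterministically from an object
# handle. Every client computes the same answer, so no directory
# service is consulted on the I/O path.

def server_for(handle, servers):
    """Pick the server responsible for an object handle (sketch)."""
    digest = hashlib.sha256(str(handle).encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

The key property is determinism: the same handle always resolves to the same server for a given server list.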
  • 11. Project Timeline
    - 1994-2004: PVFS design and development at CU (Dr. Ligon) + ANL (CU graduates)
    - 2004-2010: PVFS2; primary maintenance and development by ANL (CU graduates) + community
    - 2007-2010: new PVFS branch; new development focused on a broader set of problems; improved metadata, stability, server-side operations, newer kernels, testing
    - SC10 (fall 2010): OrangeFS announced with the community; now the mainline of future development as of 2.8.4; support and targeted development services initially offered by Omnibond
    - SC11 (fall 2011): 2.8.5 + Windows client; stability, replicate on immutable
    - Spring 2012: 2.8.6 + webpack; performance improvements, Direct Lib + cache, stability, WebDAV, S3
    - Winter 2013: 2.8.7 + webpack; performance improvements, stability
    - Spring 2014: 2.8.8 + webpack; performance improvements, stability, shared mmap, multi TCP/IP server homing, Hadoop MapReduce, user lib fixes, new spec file for RPMs + DKMS; available in the AWS Marketplace
    - Summer 2014: 2.9.0; distributed directory metadata, capability-based security
    - 2015: OrangeFS 3.0; replicated metadata and file data, 128-bit UUIDs for file handles, parallel background processes, web-based management UI, self-healing processes, data balancing
  • 12. In the Code Now
  • 13. Server-to-Server Communications (2.8.5)
    - Traditional metadata operation: a create request causes the client to communicate with all servers, O(p)
    - Scalable metadata operation: a create request goes to a single server, which in turn communicates with the other servers using a tree-based protocol, O(log p)
    (Diagram: app/client middleware reaching the servers over the network in both patterns)
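The O(p) vs. O(log p) contrast above can be seen by counting communication rounds in a simple doubling model: each server that already holds the request forwards it to one new server per round. This is a sketch of the scaling argument, not the actual tree protocol:

```python
# Sketch: in a tree-based fan-out, the set of informed servers doubles
# each round, so reaching p servers takes ceil(log2(p)) rounds instead
# of the single client issuing p requests itself.

def rounds_to_reach(p):
    """Rounds until all p servers hold the request (doubling model)."""
    reached, rounds = 1, 0
    while reached < p:
        reached *= 2   # every informed server forwards to one new server
        rounds += 1
    return rounds
```

So a create touching 1024 servers completes in 10 forwarding rounds rather than 1024 client-issued requests.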
  • 14. Recent Additions (2.8.5)
    - SSD metadata storage
    - Replicate on immutable (file based)
    - Windows client: supports 32/64-bit Windows Server 2008, R2, Vista, 7
  • 15. Direct Access Interface (2.8.6)
    - Implements POSIX system calls, stdio library calls, and parallel extensions (noncontiguous I/O, non-blocking I/O), plus the MPI-IO library
    - More boundary conditions found and fixed in the upcoming 2.8.7
    (Diagram: app over the Direct lib / PVFS lib in user space vs. the kernel client core, over IB or TCP)
  • 16. Direct Interface Client Caching (2.8.6)
    - The Direct Interface enables multi-process coherent client caching on a single client
    (Diagram: client application over the Direct Interface client cache, in front of the file system)
  • 17. WebDAV (2.8.6 webpack)
    - Apache serves WebDAV in front of OrangeFS over the PVFS protocol
    - Supports the DAV protocol; tested with the Litmus DAV test suite
    - Supports DAV cooperative locking in metadata
  • 18. S3 (2.8.6 webpack)
    - Apache serves the S3 API in front of OrangeFS over the PVFS protocol
    - Tested using the s3cmd client
    - Files remain accessible via other access methods; containers are directories
    - Accounting pieces not implemented
  • 19. Summary - Recently Added to OrangeFS
    - In 2.8.3: server-to-server communication, SSD metadata storage, replicate on immutable
    - 2.8.4, 2.8.5: fixes, support for newer kernels; Windows client
    - 2.8.6: performance, fixes, IB updates; Direct Access libraries (initial release) with a preload library for applications, including an optional client cache; webpack with WebDAV (with file locking) and S3
  • 20. OrangeFS on the AWS Marketplace
    - Available on the Amazon AWS Marketplace, brought to you by Omnibond
    - OrangeFS instances provide a unified high-performance file system backed by EBS volumes and DynamoDB
  • 21. In 2.8.8 (Just Released)
  • 22. Hadoop JNI Interface (2.8.8)
    - OrangeFS Java Native Interface: an extension of the Hadoop FileSystem class via JNI
    - Buffering, distribution, and the fast PVFS protocol for remote configuration
  • 23. Additional Items (2.8.8)
    - Updated user lib; shared mmap support in the kernel module; support for kernels up to 3.11
    - Multi-homing servers over IP: clients can access a server over multiple interfaces (say, clients on IPoIB + clients on IPoEthernet + clients on IPoMX)
    - Enterprise installers (coming shortly): client (with DKMS for the kernel module), server, devel
  • 25. Scaling Tests
    - 16 storage servers, each with two LVM'd 5+1 RAID sets, were tested with up to 32 clients; read performance reached nearly 12 GB/s and write performance nearly 8 GB/s
  • 26. MapReduce over OrangeFS
    - 8 Dell R720 servers connected with 10 Gb/s Ethernet
    - The remote case adds 8 additional identical servers, does all OrangeFS work remotely, and only local work on the compute nodes (traditional HPC model)
    - *25% improvement with OrangeFS running remotely
  • 27. MapReduce over OrangeFS
    - 8 Dell R720 servers connected with 10 Gb/s Ethernet
    - Remote clients are R720s with single SAS disks for local data (vs. 12-disk arrays in the previous test)
  • 28. SC13 Demo Overview
    - 16 Dell R720 OrangeFS servers on the I2 Innovation Platform, with OrangeFS clients on the SC13 floor, connected at 100 Gb/s
    - Participants: Clemson, USC, I2, Omnibond
  • 29. SC13 WAN Performance
    - Multiple concurrent client file creates over the PVFS protocol (nullio)
  • 30. For 2.9 (Summer 2014)
  • 31. Distributed Directory Metadata (2.9.0)
    - Directory entries are spread across servers with extensible hashing (diagram: DirEnt1-DirEnt6 hashed across Server0-Server3)
    - State management based on Giga+ (Garth Gibson, CMU)
    - Improves access times for directories with a very large number of entries
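The core idea of the slide above — spreading one directory's entries across metadata servers by hashing the entry name — can be sketched in a few lines. Real Giga+ splits hash ranges incrementally as a directory grows; this simplified version only shows why lookups stay constant-time regardless of directory size (names are illustrative):

```python
import hashlib

# Toy sketch: each directory entry's name hashes to the metadata
# server that stores it, so looking up one entry in a million-entry
# directory touches exactly one server.

def dirent_server(name, num_servers):
    """Pick the metadata server holding this directory entry (sketch)."""
    h = hashlib.md5(name.encode()).digest()
    return int.from_bytes(h[:4], "big") % num_servers
```

Because the hash spreads names roughly uniformly, inserts into a huge directory also fan out across servers instead of serializing on one.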
  • 32. Capability-Based Security (2.9.0)
    - A client presents a cert or credential; I/O operations carry signed capabilities, verified via OpenSSL PKI
    - 3 security modes:
      - Basic: OrangeFS/PVFS classic mode
      - Key-based: keys are used to authorize clients for use with the FS
      - User-certificate-based with LDAP: user certs are used for access to the file system and are generated based on LDAP uid/gid info
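The signed-capability flow above can be sketched as follows. Note the hedge: OrangeFS uses OpenSSL public-key signatures, but this self-contained illustration substitutes an HMAC over a shared secret; the field names and TTL are hypothetical. The point is only that an I/O server can validate a token locally, without contacting the issuer:

```python
import hashlib
import hmac
import json
import time

# Stand-in for the server's key material (the real system uses an
# OpenSSL PKI key pair, not a shared secret).
SECRET = b"demo-key"

def issue_capability(handle, ops, ttl=60):
    """Issue a signed token granting `ops` on object `handle` (sketch)."""
    cap = {"handle": handle, "ops": sorted(ops), "exp": int(time.time()) + ttl}
    body = json.dumps(cap, sort_keys=True).encode()
    return cap, hmac.new(SECRET, body, hashlib.sha256).hexdigest()

def verify_capability(cap, sig):
    """An I/O server checks the signature and expiry locally."""
    body = json.dumps(cap, sort_keys=True).encode()
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected) and cap["exp"] > time.time()
```

Tampering with any field (the handle, the operation list, the expiry) invalidates the signature, which is what lets data servers trust capabilities they did not issue.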
  • 34. Replication / Redundancy (OrangeFS 3.0)
    - Redundant metadata: seamless recovery after a failure; redundant objects from the root directory down; configurable
    - Redundant data: update mode (real time, on close, on immutable, none); configurable number of replicas
    - Real-time "forked flow" work shows little overhead
    - Replicate on close; replicate to external (like LTFS); looking at supporting an HSM option to external (no local replica)
    - Emphasis on continuous operation
  • 35. Handles -> UUIDs (OrangeFS 3.0)
    - An OID (object identifier) is a 128-bit UUID that is unique to the data-space
    - An SID (server identifier) is a 128-bit UUID that is unique to each server
    - No more than one copy of a given data-space can exist on any server
    - The (OID, SID) tuple is unique within the file system; (OID, SID1), (OID, SID2), (OID, SID3) are copies of the object on different servers
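The (OID, SID) scheme above is easy to demonstrate with standard 128-bit UUIDs — one OID shared by all replicas, a distinct SID per server, and the pair identifying each copy (the variable names are illustrative):

```python
import uuid

# Sketch of the 3.0 handle scheme: uuid4() yields a random 128-bit
# UUID. One object gets a single OID; each of three servers has its
# own SID; a replica is the (OID, SID) pair.

oid = uuid.uuid4()                        # the object's identity
sids = [uuid.uuid4() for _ in range(3)]   # three distinct servers
replicas = {(oid, sid) for sid in sids}   # one copy per server
```

All three tuples are distinct (unique SIDs) while naming the same logical object (shared OID), which is exactly the invariant the slide states.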
  • 36. Server Location / SID Management (OrangeFS 3.0)
    - In an exascale environment with the potential for thousands of I/O servers, it will no longer be feasible for each server to know about all other servers
    - Server discovery: servers will know a subset of their neighbors at startup (or cached from previous startups), similar to DNS domains
    - Servers will learn about unknown servers on an as-needed basis and cache them, similar to DNS query mechanisms (root servers, authoritative domain servers)
    - SID cache: an in-memory DB to store server attributes
  • 37. Policy-Based Location (OrangeFS 3.0)
    - User-defined attributes for servers and clients, stored in the SID cache
    - Policy is used for data location, replication location, and multi-tenant support
    - Completely flexible: rack, row, app, region
  • 38. Background Parallel Processing Infrastructure (3.0)
    - Modular infrastructure to easily build background parallel processes for the file system
    - Used for: gathering stats for monitoring; usage calculation (can be leveraged for directory space restrictions, chargebacks); background safe FSCK processing (can mark bad items in metadata); background checksum comparisons; etc.
  • 39. Admin REST Interface / Admin UI (3.0)
    - Apache exposes a REST interface in front of OrangeFS over the PVFS protocol
  • 40. Data Migration / Management (OrangeFS 3.x)
    - Built on redundancy and background (DBG) processes
    - Migrate objects between servers: de-populate a server going out of service; populate a newly activated server (HW lifecycle); move computation to data; hierarchical storage
    - Uses existing metadata services
    - Possible: directory hierarchy cloning with copy-on-write (dev, QA, prod environments with a high percentage of data overlap)
  • 41. Hierarchical Data Management (OrangeFS 3.x)
    - OrangeFS metadata ties together HPC storage, intermediate storage, NFS, archive, and remote systems (exceed, OSG, Lustre, GPFS, Ceph, Gluster) for OrangeFS users
  • 42. Attribute-Based Metadata Search (OrangeFS 3.x)
    - Clients tag files with keys/values, which are indexed on the metadata servers
    - Clients query for files based on keys/values via parallel key/value query
    - Returns file handles, with options for filename and path
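The tag-and-query model above amounts to an inverted index from (key, value) pairs to file handles, with queries intersecting posting sets. A minimal single-node sketch (all names hypothetical; the real index is distributed across metadata servers and queried in parallel):

```python
from collections import defaultdict

# Toy inverted index: (key, value) -> set of file handles.
index = defaultdict(set)

def tag(handle, **attrs):
    """Tag a file handle with key/value attributes."""
    for k, v in attrs.items():
        index[(k, v)].add(handle)

def query(**attrs):
    """Return handles matching ALL given key/value pairs."""
    sets = [index[(k, v)] for k, v in attrs.items()]
    return set.intersection(*sets) if sets else set()
```

For example, tagging checkpoints with a project name lets a client later retrieve exactly that project's checkpoint files by handle, without walking the namespace.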
  • 44. Extend Capability-Based Security
    - Enable certificate-level access (in process); federated-access capable
    - Can be integrated with rules-based access control: department X in company Y can share with department Q in company Z
    - Rules and roles establish the relationship; each company manages its own control of who is in the company and in which department
  • 45. SDN - OpenFlow
    - Working with the OpenFlow research team at CU
    - OpenFlow separates the control plane from delivery, giving the ability to control the network with software
    - Looking at bandwidth optimization leveraging OpenFlow and OrangeFS
  • 46. ParalleX
    - ParalleX is a new parallel execution model
    - Key components: Asynchronous Global Address Space (AGAS), threads, parcels (message driven instead of message passing), locality, percolation, synchronization primitives
    - High Performance ParalleX (HPX): a library implementation written in C++
  • 47. PXFS
    - Parallel I/O for ParalleX, based on PVFS; common themes with OrangeFS Next
    - Primary objective: unification of the ParalleX and storage name spaces; integration of the AGAS and storage metadata subsystems; a persistent object model
    - Extends ParalleX with a number of I/O concepts (replication, metadata) and extends I/O with ParalleX concepts (moving work to data, local synchronization)
    - A joint effort of LSU, Clemson, and Indiana U. (Walt Ligon, Thomas Sterling)
  • 49. Johns Hopkins OrangeFS Selection
    - JHU HLTCOE selected OrangeFS after evaluating Ceph, GlusterFS, Lustre, and OrangeFS
    - "Leveraging OrangeFS for the parallel filesystem, the system as a whole is capable of delivering 30GB/s write, 46GB/s read, and between 37,260-237,180 IOPS of performance. The variation in IOPS performance is dependent on the file size and number of bytes written per commit as documented in the Test Results section."*
    - "The final system design represents a 2,775% increase in read performance and a 1,763-11,759% increase in IOPS"*
    - * http://hltcoe.jhu.edu/uploads/publications/papers/14662_slides.pdf
  • 50. Learning More
    - www.orangefs.org: releases, documentation, wiki
    - pvfs2-users@beowulf-underground.org: support for users
    - pvfs2-developers@beowulf-underground.org: support for developers
  • 51. Support & Development Services
    - www.orangefs.com & www.omnibond.com
    - Professional support and development team; buy into the project
  • 52. Omnibond Info - Solution Areas
    - Intelligent transportation solutions
    - Identity Manager drivers & Sentinel connectors
    - Parallel scale-out storage software
    - Social media interaction system
    - Computer vision
    - Enterprise & personal