SlideShare una empresa de Scribd logo
1 de 44
Descargar para leer sin conexión
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                 O’Reilly Open Source Convention 2008




          Ruby Track: Session 2471

 Real-time Computer Vision With Ruby

                J. Wedekind

         Wednesday, July 23rd 2008


       Nanorobotics EPSRC Basic Technology Grant

       Microsystems and Machine Vision Laboratory
       Modelling Research Centre




1/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                           Introduction
                            In This Talk




Brain.eval <<REQUIRED
  require ’RMagick’
  require ’Qt4’
  require ’complex’
  require ’matrix’
  require ’narray’
REQUIRED

 2/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                        Introduction
                  EU Esprit MINIMAN Project




3/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                         Introduction
                    EU IST MiCRoN Project




4/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                       Introduction
                 EPSRC Nanorobotics Project




                                    Electron Microscopy
                                   • telemanipulation
                                   • drift-compensation
                                   • closed-loop control
                                     Computer Vision
                                   • real-time software
                                   • system integration
                                   • theoretical insights



5/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                         Introduction
                      Industrial Robotics


                             Default Situation
                   • proprietary operating system
                   • proprietary robot software
                   • proprietary process simulation software
                   • proprietary mathematics software
                   • proprietary machine vision software
                   • proprietary manufacturing software
                        Total Cost of Lock-in (TCL)
                   • duplication of work
                   • integration problems
                   • lack of progress
                   • handicapped developers


6/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                         Introduction
                Innovation Happens Elsewhere



                 Ron Goldman & Richard P. Gabriel

           “The market need is greatest for platform
           products because of the importance of a
           reliable promise that vendor lock-in will not
           endanger the survival of products built or
           modified on the software stack above that
           platform.”


           “It is important to remove as many barriers to
           collaboration as possible: social, political,
           and technical.”




7/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                   Design Considerations
            HornetsEye’s Distinguishing Features




       • GPL
       • Ruby
       • Real-Time




8/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                 Design Considerations
                                                        GPLv3

Four Freedoms (Richard Stallman)
 1. The freedom to run the program, for any purpose.
 2. The freedom to study how the program works, and adapt it to your
    needs.
 3. The freedom to redistribute copies so you can help your neighbor.
 4. The freedom to improve the program, and release your
    improvements to the public, so that the whole community benefits.
Respect The Freedom Of Downstream Users (Richard Stallman)
GPL requires derived works to be available under the same license.
Covenant Not To Assert Patent Claims (Eben Moglen)
GPLv3 deters users of the program from instituting patent ligitation by
the threat of withdrawing further rights to use the program.
Other (Eben Moglen)
GPLv3 has regulations against DMCA restrictions and tivoization.


                   9/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                          Design Considerations
                                                                  Ruby


# ---------------------------------------------------------------------------------------------------------
img = Magick::Image.read( "circle.png" )[ 0 ]
str = img.export_pixels_to_str( 0, 0, img.columns, img.rows, "I", Magick::CharPixel )
arr = NArray.to_na( str, NArray::BYTE, img.columns, img.rows )
puts ( arr / 128 ).inspect
# NArray.byte(20,20):
# [ [ 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1 ],
# [ 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1 ],
# [ 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1 ],
# [ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1 ],
# [ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1 ],
# [ 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1 ],
# [ 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1 ],
# [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ],
# [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ],
# ...
# ---------------------------------------------------------------------------------------------------------

                  No high-level code in C++!

                  10/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                         Design Considerations
                               Real-Time



        Real-time Object Recognition




11/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                              HornetsEye Core
                                                              Compact Storage


# ------------------------------------------------------------------------------------------------------
class Sequence
 Type = Struct.new( :name, :type, :size, :default, :pack, :unpack ); @@types = []
 def Sequence.register_type( sym, type, size, default, pack, unpack )
   eval "#{sym.to_s} = Type.new( sym.to_s, type, size, default, pack, unpack )"
 end
 register_type( :OBJECT, Object, 1, nil, proc { |o| [o] }, proc { |s| s[0] } )
 register_type( :UBYTE, Fixnum, 1, 0, proc { |o| [o].pack("C") },
    proc { |s| s.unpack("C")[0] } )
 def initialize( type = OBJECT, n = 0, value = nil )
   @type, @data = type, type.pack.call( value == nil ? type.default : value ) * n
   @size = n
 end
 def []( i )
   p = i * @type.size; @type.unpack.call( @data[ p...( p + @type.size ) ] )
 end
 def []=( i, o )
   p = i * @type.size; @data[ p...( p + @type.size ) ] = @type.pack.call( o ); o
 end
end
# ------------------------------------------------------------------------------------------------------


               12/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                            HornetsEye Core
                                                          N-Dimensional Arrays



# ------------------------------------------------------------------------------------------------------
class MultiArray
 UBYTE = Sequence::UBYTE
 OBJECT = Sequence::OBJECT
 def initialize( type = OBJECT, *shape )
   @shape = shape
   stride = 1
   @strides = shape.collect { |s| old = stride; stride *= s; old }
   @data = Sequence.new( type, shape.inject( 1 ) { |r,d| r*d } )
 end
 def []( *indices )
   @data[ indices.zip( @strides ).inject( 0 ) { |p,i| p + i[0] * i[1] } ]
 end
 def []=( *indices )
   value = indices.pop
   @data[ indices.zip( @strides ).inject( 0 ) { |p,i| p + i[0] * i[1] } ] = value
 end
end
# ------------------------------------------------------------------------------------------------------




               13/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                          HornetsEye Core
                                                       Element-wise Operations


# ------------------------------------------------------------------------------------------------------
class Sequence
 attr_reader :type, :data, :size
 def collect( type = @type )
   retval = Sequence.new( type, @size )
   ( 0...@size ).each { |i| retval[i] = yield self[i] }
   retval
 end
end
class MultiArray
 attr_accessor :shape, :strides, :data
 def MultiArray.import( type, data, *shape )
   retval = MultiArray.new( type )
   stride = 1; retval.strides = shape.collect { |s| old = stride; stride *= s; old }
   retval.shape, retval.data = shape, data; retval
 end
 def collect( type = @data.type, &action )
   MultiArray.import( type, @data.collect( type, &action ), *@shape )
 end
end
# ------------------------------------------------------------------------------------------------------


               14/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                           HornetsEye Core
                                                         Return-type Coercions



# ------------------------------------------------------------------------------------------------------
class Sequence
 @@coercions = Hash.new
 @@coercions.default = OBJECT
 def Sequence.register_coercion( result, type1, type2 )
   @@coercions[ [ type1, type1 ] ] = type1
   @@coercions[ [ type2, type2 ] ] = type2
   @@coercions[ [ type1, type2 ] ] = result
   @@coercions[ [ type2, type1 ] ] = result
 end
 register_coercion( OBJECT, OBJECT, UBYTE )
 def +( other )
   retval = Sequence.new( @@coercions[ [ @type, other.type ] ], @size )
   ( 0...@size ).each { |i| retval[i] = self[i] + other[i] }
   retval
 end
end
# ------------------------------------------------------------------------------------------------------




               15/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                             HornetsEye Core
                                            Fast Malloc Objects



VALUE Malloc::wrapMid( VALUE rbSelf, VALUE rbOffset, VALUE rbLength )
{
  char *self; Data_Get_Struct( rbSelf, char, self );
  return rb_str_new( self + NUM2INT( rbOffset ), NUM2INT( rbLength ) );
}



m=Malloc.new(1000)
m.mid(10,4)
# "000000000000"
m.assign(10,"test")
m.mid(10,4)
# "test"




                  16/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                                       HornetsEye Core
                                                                       Native Operations

g,h ∈ {0, 1,  . .  w − 1} × {0, 1, . . . , h − 1} → R
             . ,
   x1 
   ) = g( x1 )/2
                                                                             MultiArray.dfloat( 320, 240 ):
   
h( 
   
               
               
               
                                                       MultiArray.dfloat        [ [ 245.0, 244.0, 197.0, ... ],
   
  x          
              x 
    2              2                                     Fixnum                    [ 245.0, 247.0, 197.0, ... ],
                                                                                   [ 247.0, 248.0, 187.0, ... ]
Ruby                                              h=g/2                          ...

                                                                                           [3.141]
     MultiArray.respond to?( ”binary div lint dfloat” )
                                      no




                                                                                                        String.unpack(”D”)
                                                                                      Array.pack(”D”)
     h = g.collect { |x| x / 2 }                         yes



C++                                        for ( int i=0; i<n; i++ )
                                              *r++ = *p++ / q;
     MultiArray.binary   div   byte   byte
     MultiArray.binary   div   byte   bytergb                          ”x54xE3xA5x9BxC4x20x09x40”
     MultiArray.binary   div   byte   dcomplex
     MultiArray.binary   div   byte   dfloat
     MultiArray.binary   div   byte   dfloatrgb
     ...



                                      17/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                HornetsEye I/O
                           Colourspace Conversions




                                         
 Y   0.299
  
                   0.587     0.114 
                                      
                                         R  0 
                                             
                                             
  
                                   
                                            
                                             
 =
  
  
Cb  −0.168736 −0.331264
                                      
                                          + 
                                             
                                             
                                0.500    G 128
                                   
                                            
  
  
                                   
                                      
                                            
                                             
                                             
  
                                   
                                            
                                             
C   0.500
   r               −0.418688 −0.081312    B  128
           also see: http://fourcc.org/

   18/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                             HornetsEye I/O
                                 BMP, GIF, JPEG, PPM, PNG, PNM, TIFF, ...




mgk = Magick::Image.read( "circle.png" )[0] # code is simplified
str = magick.export_pixels_to_str( 0, 0, mgk.columns, mgk.rows, "RGB",
                                   Magick::CharPixel )
arr = MultiArray.import( MultiArray::UBYTERGB, str, mgk.columns, mgk.rows )

                                      ⇓

arr = MultiArray.load_rgb24( "circle.png" )
mgk = Magick::Image.new( *arr.shape ) { |x| x.depth = 8 }
mgk.import_pixels( 0, 0, arr.shape[0], arr.shape[1], "RGB", arr.to_s,
                   Magick::CharPixel )
Magick::ImageList.new.push( mgk ).write( "circle.png" )

                                      ⇓

arr = MultiArray.save_rgb24( "circle.png" )

                  19/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                              HornetsEye I/O
                                              Video Decoding




xine_t *m_xine = xine_new(); // code is simplified
xine_config_load( m_xine, "/home/myusername/.xine/config" );
xine_init( m_xine );
xine_video_port_t *m_videoPort = xine_new_framegrab_video_port( m_xine );
xine_stream_t *m_stream = xine_stream_new( m_xine, NULL, m_videoPort );
xine_open( m_stream, "test.avi" );
xine_video_frame_t *m_frame;
xine_get_next_video_frame( m_videoPort, &m_frame );
xine_free_video_frame( m_videoPort, &m_frame );
                                      ⇓
xine = XineInput.new( "test.avi" )
img = xine.read
                  20/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                              HornetsEye I/O
                                              Video Encoding




// "const unsigned char *data" points to I420-data of 320x240 frame
FILE *m_control = popen( "mencoder - -o test.avi"   // code is simplified
                         " -ovc lavc -lavcopts vcodec=ffv1", "w" );
fprintf( m_control, "YUV4MPEG2 W320 H240 F25000000:1000000 Ip A0:0n" );
fprintf( m_control, "FRAMEn" );
fwrite( data, 320 * 240 * 3 / 2, 1, m_control );

                                      ⇓

# "img" is of type "HornetsEye::Image" or "HornetsEye::MultiArray"
mencoder = MEncoderOutput.new( "test.avi", 25 )
mencoder.write( img )

                  21/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                HornetsEye I/O
                                          Video For Linux (V4L/V4L2)




int m_fd = open( "/dev/video0", O_RDWR, 0 );    // code is incomplete
ioctl( VIDIOC_S_FMT, &m_format );
ioctl( VIDIOC_REQBUFS, &m_req );
ioctl( VIDIOC_QUERYBUF, &m_buf[0] );
ioctl( VIDIOC_QBUF, &m_buf[0] );
ioctl( VIDIOC_STREAMON, &type );
ioctl( VIDIOC_DQBUF, &buf )
// ...
                                      ⇓
v4l2 = V4L2Input.new
img = v4l2.read
                  22/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                              HornetsEye I/O
                                          IIDC/DCAM (libdc1394)




raw1394handle_t m_handle = dc1394_create_handle( 0 ); // code is incomplete
dc1394_cameracapture m_camera; int numCameras;
nodeid_t *m_cameraNode = dc1394_get_camera_nodes( m_handle, &numCameras, 0 );
dc1394_camera_on( m_handle, 0 );
dc1394_dma_setup_capture( m_handle, m_cameraNode[ 0 ], 0,
                          FORMAT_VGA_NONCOMPRESSED, MODE_640x480_YUV422,
                          FRAMERATE_15, 4, 1, NULL, &m_camera );
dc1394_start_iso_transmission( m_handle, m_camera.node );
// ...

                                      ⇓

firewire = DC1394Input.new
img = firewire.read

                  23/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                        HornetsEye I/O
                                                    XPutImage/glDrawPixels




# ----------------------------------------------------------
img = MultiArray.load_rgb24( "howden.jpg" )
display = X11Display.new
output = XImageOutput.new
# output = OpenGLOutput.new
window = X11Window.new( display, output,
                                     320, 240 )
window.title = "Test"
output.write( img )
window.show
display.eventLoop
# ----------------------------------------------------------

                    24/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                          HornetsEye I/O
                                                           XvPutImage

# -----------------------------------------------------
xine = XineInput.new( "dvd://1" ); sleep 2
display = X11Display.new
output = XVideoOutput.new
window = X11Window.new( display, output,
                                 768, 576 * 9 / 16 )
window.title = "Test"
window.show
delay = xine.frame_duration.to_f / 90000.0
time = Time.now
while xine.status? and output.status?
  output.write( xine.read )
  time_left = delay - ( Time.now.to_f -
                  time.to_f )
  display.eventLoop( time_left * 1000 )
  time += delay
end
# -----------------------------------------------------

                        25/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                         HornetsEye I/O
                                   High Dynamic Range Images

Exposure Series           Alignment (Hugin)        Tonemapping (QtPfsGui)



                                               ⇒



                                                     Loading And Saving
                     ⇒




                                                   img = MultiArray.
                                                     load_rgbf("test.exr")
                                                   img.
                                                     save_rgbf("test.exr")


             26/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                        HornetsEye I/O
                   Qt4-QtRuby: G++ Dataflow




27/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                         HornetsEye I/O
                   Qt4-QtRuby: Ruby Dataflow




28/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                                                       HornetsEye I/O
                                                                                Qt4-QtRuby: XVideo Integration

# ---------------------------------------------------------------------------
class VideoPlayer < Qt::Widget
 def initialize
   super
   @xvideo = Hornetseye::XvWidget.new( self )
   layout = Qt::VBoxLayout.new( self )
   layout.addWidget( @xvideo )
   @xine = Hornetseye::XineInput.new( "test.avi", false )
   @timer = startTimer( @xine.frame_duration * 1000 / 90000 )
   resize( 640, 400 )
 end
 def timerEvent( e )
   begin
     if @xine
       img = @xine.read
       @xvideo.write( img )
     end
   rescue
     @xine = nil
     killTimer( @timer )
     @xvideo.clear
     @timer = 0
   end
 end
end
app = Qt::Application.new( ARGV )
VideoPlayer.new.show
app.exec
# ---------------------------------------------------------------------------


                                  29/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                 HornetsEye I/O
                               Microsoft Windows


            /

        V4LInput           VFWInput
        V4L2Input          DShowInput
        DC1394Input        —
        XineInput          —
        MPlayerInput       MPlayerInput
        MEncoderOutput     MEncoderOutput
        X11Display         W32Display
        X11Window          W32Window
        XImageOutput       GDIOutput
        OpenGLOutput       —
        XVideoOutput       —


30/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                            Introduction
                           From Here On




Brain.eval <<REQUIRED
  require ’hornetseye’
  include Hornetseye
REQUIRED




 31/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                                     Computer Vision With Ruby
                                                                       Look-Up-Tables (LUTs)

g ∈ {0, 1, . . . , w} × {0, 1, . . . , h} → {0, 1, . . . , 255}
m ∈ {0, 1, . . . , 255} → {0, 1, . . . , 255}3
                   
   x1 
   ) = m g( x1 )
   
                   
                     
h( 
   
   
  x 
                     
                     
                     
                     
                    x 
      2               2


# -----------------------------------------------------------------------
img = MultiArray.load_grey8( "test.jpg" )
class Numeric
  def clip( range )
   [ [ self, range.begin ].max, range.end ].min
 end
end
colours = {}
for i in 0...256
 hue = 240 - i * 240.0 / 256.0
 colours[i] =
  RGB( ( ( hue - 180 ).abs - 60 ).clip( 0...60 ) * 255 / 60.0,
      ( 120 - ( hue - 120 ).abs ).clip( 0...60 ) * 255 / 60.0,
      ( 120 - ( hue - 240 ).abs ).clip( 0...60 ) * 255 / 60.0 )
end
img.map( colours, MultiArray::UBYTERGB, 256 ).display
# -----------------------------------------------------------------------

                                32/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                                               Computer Vision With Ruby
                                                                                  Masks And Ramps

# -----------------------------------------------------------------------
class MultiArray
 def MultiArray.ramp1( *shape )
   retval = MultiArray.new( MultiArray::LINT, *shape )
   for x in 0...shape[0]
    retval[ x, 0...shape[1] ] = x                                                        Compute Bounding Box
   end
   retval
 end
 # def MultiArray.ramp2 ...
end
input = V4LInput.new
x, y =
 MultiArray.ramp1( input.width, input.height ),
 MultiArray.ramp2( input.width, input.height )
display = X11Display.new
output = XVideoOutput.new
window = X11Window.new( display, output, 640, 480 )                              0   1   2   3   4   5
window.title = "Thresholding"                                                    0   1   2   3   4   5
window.show                                                                      0   1   2   3   4   5
while input.status? and output.status?                                           0   1   2   3   4   5
 img = input.read_grey8                                                          0   1   2   3   4   5
 mask = img.binarise_lt( 48 )                                                    0   1   2   3   4   5
 result = ( img / 4 ) * ( mask + 1 )
 if mask.sum > 0                                                             2 3 4 2 3 1 2 3 2 3
   bbox = [ x.mask( mask ).range, y.mask( mask ).range ]
   result[ *bbox ] *= 2                                                              1≤x≤4
 end
 output.write( result )
 display.processEvents
end
# -----------------------------------------------------------------------


                                         33/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                                           Computer Vision With Ruby
                                                                           Warps (Use Image As LUT)
                                                                                     
                                                                   g W( x1 )
                                                                                        x1 
g ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1} → R3
                                                                      
                                                                                  if W( ) ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1}
                                                                  
                                                                                     
                                                                                        
                                                            x1        
                                                                                      
h ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1} → R3      ) =
                                                            
                                                                                   
                                                         h( 
                                                                      
                                                                        x             
                                                                                       x 
                                                                        2              2
W ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1} → Z2
                                                           x    
                                                                  
                                                              2   
                                                                  
                                                                   0
                                                                  
                                                                                  otherwise
# ----------------------------------------------------------------------
class MultiArray
 # def MultiArray.ramp1 ...
 def MultiArray.ramp2( *shape )
   retval = MultiArray.new( MultiArray::LINT, *shape )
   for y in 0...shape[1]
    retval[ 0...shape[0], y ] = y
   end
   retval
 end
end
img = MultiArray.load_rgb24( "test.jpg" )
w, h = *img.shape; c = 0.5 * h
x, y = MultiArray.ramp1( h, h ), MultiArray.ramp2( h, h )
warp = MultiArray.new( MultiArray::LINT, h, h, 2 )
warp[ 0...h, 0...h, 0 ], warp[ 0...h, 0...h, 1 ] =
 ( ( ( x - c ).atan2( y - c ) / Math::PI + 1 ) * w / 2 - 0.5 ),
 ( ( x - c ) ** 2 + ( y - c ) ** 2 ).sqrt
img.warp_clipped( warp ).display
# ----------------------------------------------------------------------

                                34/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                                  Computer Vision With Ruby
                                                                 Affine Transform using Warps
                                                                                                     
# --------------------------------------------------------------------        x1  cos(α) − sin(α)    x1 
                                                                              ) = 
                                                                              
                                                                              
                                                                         Wα ( 
                                                                                                   
                                                                                                        
                                                                                                         
class MultiArray                                                                  
                                                                                    
                                                                                                   
                                                                                                        
                                                                                                         
                                                                              
                                                                             x     sin(α) cos(α) 
                                                                                                   
                                                                                                        
                                                                                                         
                                                                                                        x 
 def MultiArray.ramp1( *shape )                                                 2                          2
  retval = MultiArray.new( MultiArray::LINT, *shape )
  for x in 0...shape[0]
    retval[ x, 0...shape[1] ] = x
  end
  retval
 end
 # def MultiArray.ramp2 ...
end
img = MultiArray.load_rgb24( "test.jpg" )
w, h = *img.shape
v = Vector[ MultiArray.ramp1( w, h ) - w / 2,
         MultiArray.ramp2( w, h ) - h / 2 ]
angle = 30.0 * Math::PI / 180.0
m = Matrix[ [ Math::cos( angle ), -Math::sin( angle ) ],
         [ Math::sin( angle ), Math::cos( angle ) ] ]
warp = MultiArray.new( MultiArray::LINT, w, h, 2 )
warp[ 0...w, 0...h, 0 ], warp[ 0...w, 0...h, 1 ] =
  ( m * v )[0] + w / 2, ( m * v )[1] + h / 2
img.warp_clipped( warp ).display
# --------------------------------------------------------------------

                           35/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                  Computer Vision With Ruby
          Center Of Gravity And Principal Components




36/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                    Computer Vision With Ruby
                                    Linear Shift-Invariant Filters

   Input Image                Sharpen                   Gaussian Blur




Gauss-Gradient (X)       Gauss-Gradient (Y)




                 37/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                   Computer Vision With Ruby
                                    Edge- And Corner-Images

  Input Image                  Sobel                 Gauss-Gradient




Harris-Stephens         Kanade-Lucas-Tomasi




                38/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                                         Computer Vision With Ruby
                                                   (Inverse Compositional) Lucas-Kanade

given: template T , image I , previous pose p
sought: pose-change ∆p
argmin           ||T (x) − I(W p (W∆p (x)))||2 dx = (∗)
                               −1  −1
                                                                    −1   −1
  ∆p       x∈T                                                  I(W p (W∆p (x)))
(1) T (x) − I(W p (W∆p (x))) = T (W∆p (x)) − I(W p (x))
                −1  −1                           −1                 −1
                                                                I(W p (x))
                          δT          T     δW p
(2) T (W∆p (x)) ≈ T (x) +    (x)          ·      (x) · ∆p
                          δx                 δp
   (1,2)
(∗) = argmin(||H p + b||2 ) = (H T H)−1 H T b
            p
             
             h1,1 h1,2 · · ·
                                          
                                          b1                          T (x)
             
             
                              
                               
                                          
                                           
                                           
                                         
where H =  2,1 h2,2 · · ·     and b = b 
             
             
             h                
                                          
                                           
             
                              
                                          2
                                           
                                           
              .    .                     .
                          .. 
                                         
              .    .
                                         .
                             .
                                         
                .   .                        .
                                         

        δT      T  δW p
hi, j =    (xi ) ·      (xi ) , bi = T (xi ) − I(W p (xi ))
                                                   −1
        δx         δp j

           S. Baker and I. Matthew: “Lucas-Kanade 20 years on: a unifying framework”
                       http://www.ri.cmu.edu/projects/project 515.html

                            39/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                  Computer Vision With Ruby
                              3 Degrees-Of-Freedom Lucas-Kanade

                            Initialisation
p = Vector[ xshift, yshift, rotation ]
w, h, sigma = tpl.shape[0], tpl.shape[1], 5.0
x, y = xramp( w, h ), yramp( w, h )
gx = tpl.gauss_gradient_x( sigma )
gy = tpl.gauss_gradient_y( sigma )
c = Matrix[ [ 1, 0 ], [ 0, 1 ], [ -y, x ] ] * Vector[ gx, gy ]
hs = ( c * c.covector ).collect { |e| e.sum }
                              Tracking
field = MultiArray.new( MultiArray::SFLOAT, w,   h, 2 )
field[ 0...w, 0...h, 0 ] = x * cos( p[2] ) - y   * sin( p[2] ) + p[0]
field[ 0...w, 0...h, 1 ] = x * sin( p[2] ) + y   * cos( p[2] ) + p[1]
diff = img.warp_clipped_interpolate( field ) -   tpl
s = c.collect { |e| ( e * diff ).sum }
d = hs.inverse * s
p += Matrix[ [ cos(p[2]), -sin(p[2]), 0 ],
             [ sin(p[2]), cos(p[2]), 0 ],
             [          0,          0, 1 ] ] *   d


           40/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                   Computer Vision With Ruby
                Interactive Presentation Software




41/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                           Current/Future Work


• feature extraction
  – multiresolution Lucas-Kanade
  – wavelet-based features
• feature descriptors
  – appearance templates
• feature based object recognition
  – geometric hashing
  – RANSAC
• feature based tracking
  – bounded hough transform
• parallel processing

No high-level code in C++!
                  42/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                                    Appeal




Computer vision only will happen if we ...
• break with business as usual
• remove all barriers to collaboration
• allow users and developers to innovate
• need fully hackable hardware
• fight for a free software stack

       43/44
http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf
                           Conclusion
        http://rubyforge.org/projects/hornetseye/

            Let’s do it!




44/44

Más contenido relacionado

Similar a Real-time Computer Vision With Ruby - OSCON 2008

PyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web ApplicationsPyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web ApplicationsGraham Dumpleton
 
Using Smalltalk for controlling robotics systems
Using Smalltalk for controlling robotics systemsUsing Smalltalk for controlling robotics systems
Using Smalltalk for controlling robotics systemsSerge Stinckwich
 
Linux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - WonokaerunLinux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - Wonokaerunidsecconf
 
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...Intel® Software
 
Java code coverage with JCov. Implementation details and use cases.
Java code coverage with JCov. Implementation details and use cases.Java code coverage with JCov. Implementation details and use cases.
Java code coverage with JCov. Implementation details and use cases.Alexandre (Shura) Iline
 
ARM IoT Firmware Emulation Workshop
ARM IoT Firmware Emulation WorkshopARM IoT Firmware Emulation Workshop
ARM IoT Firmware Emulation WorkshopSaumil Shah
 
Automated reduction of attack surface using call graph enumeration
Automated reduction of attack surface using call graph enumerationAutomated reduction of attack surface using call graph enumeration
Automated reduction of attack surface using call graph enumerationRuo Ando
 
Integris Security - Hacking With Glue ℠
Integris Security - Hacking With Glue ℠Integris Security - Hacking With Glue ℠
Integris Security - Hacking With Glue ℠Integris Security LLC
 
How to write clean & testable code without losing your mind
How to write clean & testable code without losing your mindHow to write clean & testable code without losing your mind
How to write clean & testable code without losing your mindAndreas Czakaj
 
Swift profiling middleware and tools
Swift profiling middleware and toolsSwift profiling middleware and tools
Swift profiling middleware and toolszhang hua
 
Fine line between performance and security
Fine line between performance and securityFine line between performance and security
Fine line between performance and securityAlmudena Vivanco
 
Coverage Solutions on Emulators
Coverage Solutions on EmulatorsCoverage Solutions on Emulators
Coverage Solutions on EmulatorsDVClub
 
Voice Enable Blind Assistance System -Real time Object Detection
Voice Enable Blind Assistance System -Real time Object DetectionVoice Enable Blind Assistance System -Real time Object Detection
Voice Enable Blind Assistance System -Real time Object DetectionIRJET Journal
 
Labs_BT_20221017.pptx
Labs_BT_20221017.pptxLabs_BT_20221017.pptx
Labs_BT_20221017.pptxssuserb4d806
 
4_image_detection.pdf
4_image_detection.pdf4_image_detection.pdf
4_image_detection.pdfFEG
 
Improving app performance with Kotlin Coroutines
Improving app performance with Kotlin CoroutinesImproving app performance with Kotlin Coroutines
Improving app performance with Kotlin CoroutinesHassan Abid
 
Demystifying Binary Reverse Engineering - Pixels Camp
Demystifying Binary Reverse Engineering - Pixels CampDemystifying Binary Reverse Engineering - Pixels Camp
Demystifying Binary Reverse Engineering - Pixels CampAndré Baptista
 
Python_for_Visual_Effects_and_Animation_Pipelines
Python_for_Visual_Effects_and_Animation_PipelinesPython_for_Visual_Effects_and_Animation_Pipelines
Python_for_Visual_Effects_and_Animation_PipelinesRussell Darling
 
01 foundations
01 foundations01 foundations
01 foundationsankit_ppt
 

Similar a Real-time Computer Vision With Ruby - OSCON 2008 (20)

PyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web ApplicationsPyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web Applications
 
Using Smalltalk for controlling robotics systems
Using Smalltalk for controlling robotics systemsUsing Smalltalk for controlling robotics systems
Using Smalltalk for controlling robotics systems
 
Linux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - WonokaerunLinux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - Wonokaerun
 
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...
 
Java code coverage with JCov. Implementation details and use cases.
Java code coverage with JCov. Implementation details and use cases.Java code coverage with JCov. Implementation details and use cases.
Java code coverage with JCov. Implementation details and use cases.
 
ARM IoT Firmware Emulation Workshop
ARM IoT Firmware Emulation WorkshopARM IoT Firmware Emulation Workshop
ARM IoT Firmware Emulation Workshop
 
Automated reduction of attack surface using call graph enumeration
Automated reduction of attack surface using call graph enumerationAutomated reduction of attack surface using call graph enumeration
Automated reduction of attack surface using call graph enumeration
 
Integris Security - Hacking With Glue ℠
Integris Security - Hacking With Glue ℠Integris Security - Hacking With Glue ℠
Integris Security - Hacking With Glue ℠
 
How to write clean & testable code without losing your mind
How to write clean & testable code without losing your mindHow to write clean & testable code without losing your mind
How to write clean & testable code without losing your mind
 
Swift profiling middleware and tools
Swift profiling middleware and toolsSwift profiling middleware and tools
Swift profiling middleware and tools
 
Fine line between performance and security
Fine line between performance and securityFine line between performance and security
Fine line between performance and security
 
Coverage Solutions on Emulators
Coverage Solutions on EmulatorsCoverage Solutions on Emulators
Coverage Solutions on Emulators
 
Voice Enable Blind Assistance System -Real time Object Detection
Voice Enable Blind Assistance System -Real time Object DetectionVoice Enable Blind Assistance System -Real time Object Detection
Voice Enable Blind Assistance System -Real time Object Detection
 
Labs_BT_20221017.pptx
Labs_BT_20221017.pptxLabs_BT_20221017.pptx
Labs_BT_20221017.pptx
 
4_image_detection.pdf
4_image_detection.pdf4_image_detection.pdf
4_image_detection.pdf
 
Improving app performance with Kotlin Coroutines
Improving app performance with Kotlin CoroutinesImproving app performance with Kotlin Coroutines
Improving app performance with Kotlin Coroutines
 
Demystifying Binary Reverse Engineering - Pixels Camp
Demystifying Binary Reverse Engineering - Pixels CampDemystifying Binary Reverse Engineering - Pixels Camp
Demystifying Binary Reverse Engineering - Pixels Camp
 
Python_for_Visual_Effects_and_Animation_Pipelines
Python_for_Visual_Effects_and_Animation_PipelinesPython_for_Visual_Effects_and_Animation_Pipelines
Python_for_Visual_Effects_and_Animation_Pipelines
 
01 foundations
01 foundations01 foundations
01 foundations
 
Vhdl new
Vhdl newVhdl new
Vhdl new
 

Más de Jan Wedekind

The SOLID Principles for Software Design
The SOLID Principles for Software DesignThe SOLID Principles for Software Design
The SOLID Principles for Software DesignJan Wedekind
 
Fundamentals of Computing
Fundamentals of ComputingFundamentals of Computing
Fundamentals of ComputingJan Wedekind
 
Using Generic Image Processing Operations to Detect a Calibration Grid
Using Generic Image Processing Operations to Detect a Calibration GridUsing Generic Image Processing Operations to Detect a Calibration Grid
Using Generic Image Processing Operations to Detect a Calibration GridJan Wedekind
 
Efficient implementations of machine vision algorithms using a dynamically ty...
Efficient implementations of machine vision algorithms using a dynamically ty...Efficient implementations of machine vision algorithms using a dynamically ty...
Efficient implementations of machine vision algorithms using a dynamically ty...Jan Wedekind
 
The MiCRoN Project
The MiCRoN ProjectThe MiCRoN Project
The MiCRoN ProjectJan Wedekind
 
Computer vision for microscopes
Computer vision for microscopesComputer vision for microscopes
Computer vision for microscopesJan Wedekind
 
Focus set based reconstruction of micro-objects
Focus set based reconstruction of micro-objectsFocus set based reconstruction of micro-objects
Focus set based reconstruction of micro-objectsJan Wedekind
 
Machine vision and device integration with the Ruby programming language (2008)
Machine vision and device integration with the Ruby programming language (2008)Machine vision and device integration with the Ruby programming language (2008)
Machine vision and device integration with the Ruby programming language (2008)Jan Wedekind
 
Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...
Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...
Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...Jan Wedekind
 
Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)
Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)
Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)Jan Wedekind
 
Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011
Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011
Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011Jan Wedekind
 
Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...
Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...
Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...Jan Wedekind
 
Machine Vision made easy with Ruby - ShRUG June 2010
Machine Vision made easy with Ruby - ShRUG June 2010Machine Vision made easy with Ruby - ShRUG June 2010
Machine Vision made easy with Ruby - ShRUG June 2010Jan Wedekind
 
Computer Vision using Ruby and libJIT - RubyConf 2009
Computer Vision using Ruby and libJIT - RubyConf 2009Computer Vision using Ruby and libJIT - RubyConf 2009
Computer Vision using Ruby and libJIT - RubyConf 2009Jan Wedekind
 
Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006
Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006
Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006Jan Wedekind
 
Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...
Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...
Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...Jan Wedekind
 
Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009
Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009
Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009Jan Wedekind
 

Más de Jan Wedekind (17)

The SOLID Principles for Software Design
The SOLID Principles for Software DesignThe SOLID Principles for Software Design
The SOLID Principles for Software Design
 
Fundamentals of Computing
Fundamentals of ComputingFundamentals of Computing
Fundamentals of Computing
 
Using Generic Image Processing Operations to Detect a Calibration Grid
Using Generic Image Processing Operations to Detect a Calibration GridUsing Generic Image Processing Operations to Detect a Calibration Grid
Using Generic Image Processing Operations to Detect a Calibration Grid
 
Efficient implementations of machine vision algorithms using a dynamically ty...
Efficient implementations of machine vision algorithms using a dynamically ty...Efficient implementations of machine vision algorithms using a dynamically ty...
Efficient implementations of machine vision algorithms using a dynamically ty...
 
The MiCRoN Project
The MiCRoN ProjectThe MiCRoN Project
The MiCRoN Project
 
Computer vision for microscopes
Computer vision for microscopesComputer vision for microscopes
Computer vision for microscopes
 
Focus set based reconstruction of micro-objects
Focus set based reconstruction of micro-objectsFocus set based reconstruction of micro-objects
Focus set based reconstruction of micro-objects
 
Machine vision and device integration with the Ruby programming language (2008)
Machine vision and device integration with the Ruby programming language (2008)Machine vision and device integration with the Ruby programming language (2008)
Machine vision and device integration with the Ruby programming language (2008)
 
Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...
Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...
Reconstruction (of micro-objects) based on focus-sets using blind deconvoluti...
 
Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)
Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)
Fokus-serien basierte Rekonstruktion von Mikroobjekten (2002)
 
Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011
Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011
Play Squash with Ruby, OpenGL, and a Wiimote - ShRUG Feb 2011
 
Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...
Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...
Digital Imaging with Free Software - Talk at Sheffield Astronomical Society J...
 
Machine Vision made easy with Ruby - ShRUG June 2010
Machine Vision made easy with Ruby - ShRUG June 2010Machine Vision made easy with Ruby - ShRUG June 2010
Machine Vision made easy with Ruby - ShRUG June 2010
 
Computer Vision using Ruby and libJIT - RubyConf 2009
Computer Vision using Ruby and libJIT - RubyConf 2009Computer Vision using Ruby and libJIT - RubyConf 2009
Computer Vision using Ruby and libJIT - RubyConf 2009
 
Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006
Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006
Object Recognition and Real-Time Tracking in Microscope Imaging - IMVIP 2006
 
Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...
Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...
Steerable Filters generated with the Hypercomplex Dual-Tree Wavelet Transform...
 
Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009
Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009
Ruby & Machine Vision - Talk at Sheffield Hallam University Feb 2009
 

Último

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 

Último (20)

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

Real-time Computer Vision With Ruby - OSCON 2008

  • 1. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf O’Reilly Open Source Convention 2008 Ruby Track: Session 2471 Real-time Computer Vision With Ruby J. Wedekind Wednesday, July 23rd 2008 Nanorobotics EPSRC Basic Technology Grant Microsystems and Machine Vision Laboratory Modelling Research Centre 1/44
  • 2. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Introduction In This Talk Brain.eval <<REQUIRED require ’RMagick’ require ’Qt4’ require ’complex’ require ’matrix’ require ’narray’ REQUIRED 2/44
  • 3. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Introduction EU Esprit MINIMAN Project 3/44
  • 4. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Introduction EU IST MiCRoN Project 4/44
  • 5. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Introduction EPSRC Nanorobotics Project Electron Microscopy • telemanipulation • drift-compensation • closed-loop control Computer Vision • real-time software • system integration • theoretical insights 5/44
  • 6. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Introduction Industrial Robotics Default Situation • proprietary operating system • proprietary robot software • proprietary process simulation software • proprietary mathematics software • proprietary machine vision software • proprietary manufacturing software Total Cost of Lock-in (TCL) • duplication of work • integration problems • lack of progress • handicapped developers 6/44
  • 7. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Introduction Innovation Happens Elsewhere Ron Goldman & Richard P. Gabriel “The market need is greatest for platform products because of the importance of a reliable promise that vendor lock-in will not endanger the survival of products built or modified on the software stack above that platform.” “It is important to remove as many barriers to collaboration as possible: social, political, and technical.” 7/44
  • 8. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Design Considerations HornetsEye’s Distinguishing Features • GPL • Ruby • Real-Time 8/44
  • 9. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Design Considerations GPLv3 Four Freedoms (Richard Stallman) 1. The freedom to run the program, for any purpose. 2. The freedom to study how the program works, and adapt it to your needs. 3. The freedom to redistribute copies so you can help your neighbor. 4. The freedom to improve the program, and release your improvements to the public, so that the whole community benefits. Respect The Freedom Of Downstream Users (Richard Stallman) GPL requires derived works to be available under the same license. Covenant Not To Assert Patent Claims (Eben Moglen) GPLv3 deters users of the program from instituting patent ligitation by the threat of withdrawing further rights to use the program. Other (Eben Moglen) GPLv3 has regulations against DMCA restrictions and tivoization. 9/44
  • 10. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Design Considerations Ruby # --------------------------------------------------------------------------------------------------------- img = Magick::Image.read( "circle.png" )[ 0 ] str = img.export_pixels_to_str( 0, 0, img.columns, img.rows, "I", Magick::CharPixel ) arr = NArray.to_na( str, NArray::BYTE, img.columns, img.rows ) puts ( arr / 128 ).inspect # NArray.byte(20,20): # [ [ 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1 ], # [ 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1 ], # [ 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1 ], # [ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1 ], # [ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1 ], # [ 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1 ], # [ 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1 ], # [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], # [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ], # ... # --------------------------------------------------------------------------------------------------------- No high-level code in C++! 10/44
  • 11. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Design Considerations Real-Time Real-time Object Recognition 11/44
  • 12. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye Core Compact Storage # ------------------------------------------------------------------------------------------------------ class Sequence Type = Struct.new( :name, :type, :size, :default, :pack, :unpack ); @@types = [] def Sequence.register_type( sym, type, size, default, pack, unpack ) eval "#{sym.to_s} = Type.new( sym.to_s, type, size, default, pack, unpack )" end register_type( :OBJECT, Object, 1, nil, proc { |o| [o] }, proc { |s| s[0] } ) register_type( :UBYTE, Fixnum, 1, 0, proc { |o| [o].pack("C") }, proc { |s| s.unpack("C")[0] } ) def initialize( type = OBJECT, n = 0, value = nil ) @type, @data = type, type.pack.call( value == nil ? type.default : value ) * n @size = n end def []( i ) p = i * @type.size; @type.unpack.call( @data[ p...( p + @type.size ) ] ) end def []=( i, o ) p = i * @type.size; @data[ p...( p + @type.size ) ] = @type.pack.call( o ); o end end # ------------------------------------------------------------------------------------------------------ 12/44
  • 13. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye Core N-Dimensional Arrays # ------------------------------------------------------------------------------------------------------ class MultiArray UBYTE = Sequence::UBYTE OBJECT = Sequence::OBJECT def initialize( type = OBJECT, *shape ) @shape = shape stride = 1 @strides = shape.collect { |s| old = stride; stride *= s; old } @data = Sequence.new( type, shape.inject( 1 ) { |r,d| r*d } ) end def []( *indices ) @data[ indices.zip( @strides ).inject( 0 ) { |p,i| p + i[0] * i[1] } ] end def []=( *indices ) value = indices.pop @data[ indices.zip( @strides ).inject( 0 ) { |p,i| p + i[0] * i[1] } ] = value end end # ------------------------------------------------------------------------------------------------------ 13/44
  • 14. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye Core Element-wise Operations # ------------------------------------------------------------------------------------------------------ class Sequence attr_reader :type, :data, :size def collect( type = @type ) retval = Sequence.new( type, @size ) ( 0...@size ).each { |i| retval[i] = yield self[i] } retval end end class MultiArray attr_accessor :shape, :strides, :data def MultiArray.import( type, data, *shape ) retval = MultiArray.new( type ) stride = 1; retval.strides = shape.collect { |s| old = stride; stride *= s; old } retval.shape, retval.data = shape, data; retval end def collect( type = @data.type, &action ) MultiArray.import( type, @data.collect( type, &action ), *@shape ) end end # ------------------------------------------------------------------------------------------------------ 14/44
  • 15. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye Core Return-type Coercions # ------------------------------------------------------------------------------------------------------ class Sequence @@coercions = Hash.new @@coercions.default = OBJECT def Sequence.register_coercion( result, type1, type2 ) @@coercions[ [ type1, type1 ] ] = type1 @@coercions[ [ type2, type2 ] ] = type2 @@coercions[ [ type1, type2 ] ] = result @@coercions[ [ type2, type1 ] ] = result end register_coercion( OBJECT, OBJECT, UBYTE ) def +( other ) retval = Sequence.new( @@coercions[ [ @type, other.type ] ], @size ) ( 0...@size ).each { |i| retval[i] = self[i] + other[i] } retval end end # ------------------------------------------------------------------------------------------------------ 15/44
  • 16. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye Core Fast Malloc Objects VALUE Malloc::wrapMid( VALUE rbSelf, VALUE rbOffset, VALUE rbLength ) { char *self; Data_Get_Struct( rbSelf, char, self ); return rb_str_new( self + NUM2INT( rbOffset ), NUM2INT( rbLength ) ); } m=Malloc.new(1000) m.mid(10,4) # "000000000000" m.assign(10,"test") m.mid(10,4) # "test" 16/44
  • 17. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye Core Native Operations g,h ∈ {0, 1,  . .  w − 1} × {0, 1, . . . , h − 1} → R  . ,  x1   ) = g( x1 )/2     MultiArray.dfloat( 320, 240 ):   h(            MultiArray.dfloat [ [ 245.0, 244.0, 197.0, ... ],   x    x  2 2 Fixnum [ 245.0, 247.0, 197.0, ... ], [ 247.0, 248.0, 187.0, ... ] Ruby h=g/2 ... [3.141] MultiArray.respond to?( ”binary div lint dfloat” ) no String.unpack(”D”) Array.pack(”D”) h = g.collect { |x| x / 2 } yes C++ for ( int i=0; i<n; i++ ) *r++ = *p++ / q; MultiArray.binary div byte byte MultiArray.binary div byte bytergb ”x54xE3xA5x9BxC4x20x09x40” MultiArray.binary div byte dcomplex MultiArray.binary div byte dfloat MultiArray.binary div byte dfloatrgb ... 17/44
  • 18. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Colourspace Conversions          Y   0.299       0.587 0.114    R  0                           =       Cb  −0.168736 −0.331264    +          0.500  G 128                                                  C   0.500 r −0.418688 −0.081312  B  128 also see: http://fourcc.org/ 18/44
  • 19. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O BMP, GIF, JPEG, PPM, PNG, PNM, TIFF, ... mgk = Magick::Image.read( "circle.png" )[0] # code is simplified str = magick.export_pixels_to_str( 0, 0, mgk.columns, mgk.rows, "RGB", Magick::CharPixel ) arr = MultiArray.import( MultiArray::UBYTERGB, str, mgk.columns, mgk.rows ) ⇓ arr = MultiArray.load_rgb24( "circle.png" ) mgk = Magick::Image.new( *arr.shape ) { |x| x.depth = 8 } mgk.import_pixels( 0, 0, arr.shape[0], arr.shape[1], "RGB", arr.to_s, Magick::CharPixel ) Magick::ImageList.new.push( mgk ).write( "circle.png" ) ⇓ arr = MultiArray.save_rgb24( "circle.png" ) 19/44
  • 20. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Video Decoding xine_t *m_xine = xine_new(); // code is simplified xine_config_load( m_xine, "/home/myusername/.xine/config" ); xine_init( m_xine ); xine_video_port_t *m_videoPort = xine_new_framegrab_video_port( m_xine ); xine_stream_t *m_stream = xine_stream_new( m_xine, NULL, m_videoPort ); xine_open( m_stream, "test.avi" ); xine_video_frame_t *m_frame; xine_get_next_video_frame( m_videoPort, &m_frame ); xine_free_video_frame( m_videoPort, &m_frame ); ⇓ xine = XineInput.new( "test.avi" ) img = xine.read 20/44
  • 21. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Video Encoding // "const unsigned char *data" points to I420-data of 320x240 frame FILE *m_control = popen( "mencoder - -o test.avi" // code is simplified " -ovc lavc -lavcopts vcodec=ffv1", "w" ); fprintf( m_control, "YUV4MPEG2 W320 H240 F25000000:1000000 Ip A0:0n" ); fprintf( m_control, "FRAMEn" ); fwrite( data, 320 * 240 * 3 / 2, 1, m_control ); ⇓ # "img" is of type "HornetsEye::Image" or "HornetsEye::MultiArray" mencoder = MEncoderOutput.new( "test.avi", 25 ) mencoder.write( img ) 21/44
  • 22. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Video For Linux (V4L/V4L2) int m_fd = open( "/dev/video0", O_RDWR, 0 ); // code is incomplete ioctl( VIDIOC_S_FMT, &m_format ); ioctl( VIDIOC_REQBUFS, &m_req ); ioctl( VIDIOC_QUERYBUF, &m_buf[0] ); ioctl( VIDIOC_QBUF, &m_buf[0] ); ioctl( VIDIOC_STREAMON, &type ); ioctl( VIDIOC_DQBUF, &buf ) // ... ⇓ v4l2 = V4L2Input.new img = v4l2.read 22/44
  • 23. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O IIDC/DCAM (libdc1394) raw1394handle_t m_handle = dc1394_create_handle( 0 ); // code is incomplete dc1394_cameracapture m_camera; int numCameras; nodeid_t *m_cameraNode = dc1394_get_camera_nodes( m_handle, &numCameras, 0 ); dc1394_camera_on( m_handle, 0 ); dc1394_dma_setup_capture( m_handle, m_cameraNode[ 0 ], 0, FORMAT_VGA_NONCOMPRESSED, MODE_640x480_YUV422, FRAMERATE_15, 4, 1, NULL, &m_camera ); dc1394_start_iso_transmission( m_handle, m_camera.node ); // ... ⇓ firewire = DC1394Input.new img = firewire.read 23/44
  • 24. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O XPutImage/glDrawPixels # ---------------------------------------------------------- img = MultiArray.load_rgb24( "howden.jpg" ) display = X11Display.new output = XImageOutput.new # output = OpenGLOutput.new window = X11Window.new( display, output, 320, 240 ) window.title = "Test" output.write( img ) window.show display.eventLoop # ---------------------------------------------------------- 24/44
  • 25. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O XvPutImage # ----------------------------------------------------- xine = XineInput.new( "dvd://1" ); sleep 2 display = X11Display.new output = XVideoOutput.new window = X11Window.new( display, output, 768, 576 * 9 / 16 ) window.title = "Test" window.show delay = xine.frame_duration.to_f / 90000.0 time = Time.now while xine.status? and output.status? output.write( xine.read ) time_left = delay - ( Time.now.to_f - time.to_f ) display.eventLoop( time_left * 1000 ) time += delay end # ----------------------------------------------------- 25/44
  • 26. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O High Dynamic Range Images Exposure Series Alignment (Hugin) Tonemapping (QtPfsGui) ⇒ Loading And Saving ⇒ img = MultiArray. load_rgbf("test.exr") img. save_rgbf("test.exr") 26/44
  • 27. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Qt4-QtRuby: G++ Dataflow 27/44
  • 28. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Qt4-QtRuby: Ruby Dataflow 28/44
  • 29. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Qt4-QtRuby: XVideo Integration # --------------------------------------------------------------------------- class VideoPlayer < Qt::Widget def initialize super @xvideo = Hornetseye::XvWidget.new( self ) layout = Qt::VBoxLayout.new( self ) layout.addWidget( @xvideo ) @xine = Hornetseye::XineInput.new( "test.avi", false ) @timer = startTimer( @xine.frame_duration * 1000 / 90000 ) resize( 640, 400 ) end def timerEvent( e ) begin if @xine img = @xine.read @xvideo.write( img ) end rescue @xine = nil killTimer( @timer ) @xvideo.clear @timer = 0 end end end app = Qt::Application.new( ARGV ) VideoPlayer.new.show app.exec # --------------------------------------------------------------------------- 29/44
  • 30. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf HornetsEye I/O Microsoft Windows / V4LInput VFWInput V4L2Input DShowInput DC1394Input — XineInput — MPlayerInput MPlayerInput MEncoderOutput MEncoderOutput X11Display W32Display X11Window W32Window XImageOutput GDIOutput OpenGLOutput — XVideoOutput — 30/44
  • 31. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Introduction From Here On Brain.eval <<REQUIRED require ’hornetseye’ include Hornetseye REQUIRED 31/44
  • 32. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Look-Up-Tables (LUTs) g ∈ {0, 1, . . . , w} × {0, 1, . . . , h} → {0, 1, . . . , 255} m ∈ {0, 1, . . . , 255} → {0, 1, . . . , 255}3      x1   ) = m g( x1 )         h(      x          x  2 2 # ----------------------------------------------------------------------- img = MultiArray.load_grey8( "test.jpg" ) class Numeric def clip( range ) [ [ self, range.begin ].max, range.end ].min end end colours = {} for i in 0...256 hue = 240 - i * 240.0 / 256.0 colours[i] = RGB( ( ( hue - 180 ).abs - 60 ).clip( 0...60 ) * 255 / 60.0, ( 120 - ( hue - 120 ).abs ).clip( 0...60 ) * 255 / 60.0, ( 120 - ( hue - 240 ).abs ).clip( 0...60 ) * 255 / 60.0 ) end img.map( colours, MultiArray::UBYTERGB, 256 ).display # ----------------------------------------------------------------------- 32/44
  • 33. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Masks And Ramps # ----------------------------------------------------------------------- class MultiArray def MultiArray.ramp1( *shape ) retval = MultiArray.new( MultiArray::LINT, *shape ) for x in 0...shape[0] retval[ x, 0...shape[1] ] = x Compute Bounding Box end retval end # def MultiArray.ramp2 ... end input = V4LInput.new x, y = MultiArray.ramp1( input.width, input.height ), MultiArray.ramp2( input.width, input.height ) display = X11Display.new output = XVideoOutput.new window = X11Window.new( display, output, 640, 480 ) 0 1 2 3 4 5 window.title = "Thresholding" 0 1 2 3 4 5 window.show 0 1 2 3 4 5 while input.status? and output.status? 0 1 2 3 4 5 img = input.read_grey8 0 1 2 3 4 5 mask = img.binarise_lt( 48 ) 0 1 2 3 4 5 result = ( img / 4 ) * ( mask + 1 ) if mask.sum > 0 2 3 4 2 3 1 2 3 2 3 bbox = [ x.mask( mask ).range, y.mask( mask ).range ] result[ *bbox ] *= 2 1≤x≤4 end output.write( result ) display.processEvents end # ----------------------------------------------------------------------- 33/44
  • 34. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Warps (Use Image As LUT)       g W( x1 )  x1  g ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1} → R3      if W( ) ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1}          x1         h ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1} → R3  ) =          h(       x    x     2 2 W ∈ {0, 1, . . . , w − 1} × {0, 1, . . . , h − 1} → Z2 x    2    0  otherwise # ---------------------------------------------------------------------- class MultiArray # def MultiArray.ramp1 ... def MultiArray.ramp2( *shape ) retval = MultiArray.new( MultiArray::LINT, *shape ) for y in 0...shape[1] retval[ 0...shape[0], y ] = y end retval end end img = MultiArray.load_rgb24( "test.jpg" ) w, h = *img.shape; c = 0.5 * h x, y = MultiArray.ramp1( h, h ), MultiArray.ramp2( h, h ) warp = MultiArray.new( MultiArray::LINT, h, h, 2 ) warp[ 0...h, 0...h, 0 ], warp[ 0...h, 0...h, 1 ] = ( ( ( x - c ).atan2( y - c ) / Math::PI + 1 ) * w / 2 - 0.5 ), ( ( x - c ) ** 2 + ( y - c ) ** 2 ).sqrt img.warp_clipped( warp ).display # ---------------------------------------------------------------------- 34/44
  • 35. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Affine Transform using Warps       # --------------------------------------------------------------------  x1  cos(α) − sin(α)  x1   ) =      Wα (         class MultiArray              x   sin(α) cos(α)         x  def MultiArray.ramp1( *shape ) 2 2 retval = MultiArray.new( MultiArray::LINT, *shape ) for x in 0...shape[0] retval[ x, 0...shape[1] ] = x end retval end # def MultiArray.ramp2 ... end img = MultiArray.load_rgb24( "test.jpg" ) w, h = *img.shape v = Vector[ MultiArray.ramp1( w, h ) - w / 2, MultiArray.ramp2( w, h ) - h / 2 ] angle = 30.0 * Math::PI / 180.0 m = Matrix[ [ Math::cos( angle ), -Math::sin( angle ) ], [ Math::sin( angle ), Math::cos( angle ) ] ] warp = MultiArray.new( MultiArray::LINT, w, h, 2 ) warp[ 0...w, 0...h, 0 ], warp[ 0...w, 0...h, 1 ] = ( m * v )[0] + w / 2, ( m * v )[1] + h / 2 img.warp_clipped( warp ).display # -------------------------------------------------------------------- 35/44
  • 36. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Center Of Gravity And Principal Components 36/44
  • 37. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Linear Shift-Invariant Filters Input Image Sharpen Gaussian Blur Gauss-Gradient (X) Gauss-Gradient (Y) 37/44
  • 38. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Edge- And Corner-Images Input Image Sobel Gauss-Gradient Harris-Stephens Kanade-Lucas-Tomasi 38/44
  • 39. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby (Inverse Compositional) Lucas-Kanade given: template T , image I , previous pose p sought: pose-change ∆p argmin ||T (x) − I(W p (W∆p (x)))||2 dx = (∗) −1 −1 −1 −1 ∆p x∈T I(W p (W∆p (x))) (1) T (x) − I(W p (W∆p (x))) = T (W∆p (x)) − I(W p (x)) −1 −1 −1 −1 I(W p (x)) δT T δW p (2) T (W∆p (x)) ≈ T (x) + (x) · (x) · ∆p δx δp (1,2) (∗) = argmin(||H p + b||2 ) = (H T H)−1 H T b p  h1,1 h1,2 · · ·    b1  T (x)                 where H =  2,1 h2,2 · · ·  and b = b    h            2      . . . ..       . .  . .     . . .     δT T δW p hi, j = (xi ) · (xi ) , bi = T (xi ) − I(W p (xi )) −1 δx δp j S. Baker and I. Matthew: “Lucas-Kanade 20 years on: a unifying framework” http://www.ri.cmu.edu/projects/project 515.html 39/44
  • 40. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby 3 Degrees-Of-Freedom Lucas-Kanade Initialisation p = Vector[ xshift, yshift, rotation ] w, h, sigma = tpl.shape[0], tpl.shape[1], 5.0 x, y = xramp( w, h ), yramp( w, h ) gx = tpl.gauss_gradient_x( sigma ) gy = tpl.gauss_gradient_y( sigma ) c = Matrix[ [ 1, 0 ], [ 0, 1 ], [ -y, x ] ] * Vector[ gx, gy ] hs = ( c * c.covector ).collect { |e| e.sum } Tracking field = MultiArray.new( MultiArray::SFLOAT, w, h, 2 ) field[ 0...w, 0...h, 0 ] = x * cos( p[2] ) - y * sin( p[2] ) + p[0] field[ 0...w, 0...h, 1 ] = x * sin( p[2] ) + y * cos( p[2] ) + p[1] diff = img.warp_clipped_interpolate( field ) - tpl s = c.collect { |e| ( e * diff ).sum } d = hs.inverse * s p += Matrix[ [ cos(p[2]), -sin(p[2]), 0 ], [ sin(p[2]), cos(p[2]), 0 ], [ 0, 0, 1 ] ] * d 40/44
  • 41. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Computer Vision With Ruby Interactive Presentation Software 41/44
  • 42. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Current/Future Work • feature extraction – multiresolution Lucas-Kanade – wavelet-based features • feature descriptors – appearance templates • feature based object recognition – geometric hashing – RANSAC • feature based tracking – bounded hough transform • parallel processing No high-level code in C++! 42/44
  • 43. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Appeal Computer vision only will happen if we ... • break with business as usual • remove all barriers to collaboration • allow users and developers to innovate • need fully hackable hardware • fight for a free software stack 43/44
  • 44. http://vision.eng.shu.ac.uk/jan/oscon08-foils.pdf Conclusion http://rubyforge.org/projects/hornetseye/ Let’s do it! 44/44