Getting started with Perl XS and Inline::C
1. “We should forget about the small efficiencies, say
about 97% of the time: premature optimization is
the root of all evil.”
-Donald Knuth
2. “We should forget about the small efficiencies, say
about 97% of the time: premature optimization is
the root of all evil. Yet we should not pass up our
opportunities in that critical 3%.”
-Donald Knuth
Structured Programming with Go To Statements
3. Getting Started with XS and Inline::C
The eminently palatable approach to Perl XS
Dave Oswald
daoswald@gmail.com
davido@cpan.org
4. What is XS?
● XS is an interface description file format used to create an extension interface between Perl and C code (or a C library) which one wishes to use with Perl.
– perldoc perlxs
5. What is XS?
● XS is an interface description file format used to create an extension interface between Perl and C code (or a C library) which one wishes to use with Perl.
– perldoc perlxs
● It's a “complex and awkward intermediary language.”
– Simon Cozens (co-author: Extending and Embedding Perl), in Advanced Perl Programming, 2nd Edition.
6. What is XS?
● XS is an interface description file format used to create an extension interface between Perl and C code (or a C library) which one wishes to use with Perl.
– perldoc perlxs
● “Hooking Perl to C using XS requires you to write a shell .pm module to bootstrap an object file that has been compiled from C code, which was in turn generated by xsubpp from a .xs source file containing pseudo-C annotated with an XS interface description. If that sounds horribly complicated, then you have achieved an accurate understanding of the use of xsubpp.”
– Damian Conway in Perl Best Practices
7. What is XS?
● XS is an interface description file format used to create an extension interface between Perl and C code (or a C library) which one wishes to use with Perl.
– perldoc perlxs
● “There is a big fat learning curve involved with setting up and using the XS environment.”
– Brian “Ingy” Ingerson (Inline POD)
8. What is XS?
● XS is an interface description file format used to create an extension interface between Perl and C code (or a C library) which one wishes to use with Perl.
– perldoc perlxs
● “The Cognitive Load is high, which is counter-indicated for producing cleanly implemented, bug-free code.”
– Dave Oswald
9. “...and then he said "btw, do you know C++?" and
I do, though I prefer to work in perl, and he said
"good, cuz these log files might be terrabytes big,
and in that case we're going to need c++"”
– reyjyar (http://perlmonks.org/?node=38117)
10. “Tell him you'll write a prototype in Perl, and if
that turns out not to be fast enough, you'll recode
the critical parts in C, thanks to Perl's
sophisticated interlanguage binding and runtime
profiling tools.”
– Randal Schwartz (http://perlmonks.org/?node=38180)
13. A “Shortest Path”
Use Inline::C to prototype the code.
Paste output (with minor modifications) into an XS
framework for release.
14. Ingy got “fed up with writing XS.”
– Advanced Perl Programming, 2nd edition.
15. What IS Inline::C
● Inline::C was created by Ingy döt Net.
● Embed C in your Perl application.
● Inline::C creates XS code, but you mostly don't need to care how.
● On first run it compiles the C source code.
● Inline::C creates XS bindings, making it available to your application.
● Subsequent runs use the previously compiled dynamic library.
● Compiled code is cached unless a change is made to the C source.
● Modules build on install, so no delay for end user once installed.
16. Inline::C: “Hello World!”
● From the Inline::C-Cookbook (on CPAN), Hello World in a HERE doc.
use Inline C => <<'END_C';
void greet() {
    printf("Hello, world\n");
}
END_C
greet();
17. Another simple example...
● Also from the Inline::C-Cookbook, on CPAN
perl -e 'use Inline C=>q/void greet(){printf("Hello, world\n");}/; greet'
● “A concept is only valid in Perl if it can be shown to work on one line.” – Inline::C POD
18. What does that look like in XS?
C:\Users\Dave\_Inline\build\e_101c>more e_101c.xs
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include "INLINE.h"

void greet() { printf("Hello, world\n"); }

MODULE = e_101c  PACKAGE = main

PROTOTYPES: DISABLE

void
greet ()
    PREINIT:
    I32* temp;
    PPCODE:
    temp = PL_markstack_ptr++;
    greet();
    if (PL_markstack_ptr != temp) {
        /* truly void, because dXSARGS not invoked */
        PL_markstack_ptr = temp;
        XSRETURN_EMPTY; /* return empty stack */
    }
    /* must have used dXSARGS; list context implied */
    return; /* assume stack size is correct */
19. The __C__ segment
● At the end of a Perl program, add an __END__ or __DATA__ tag, and on a subsequent line add a __C__ tag.
● Everything after that will be treated as C code.
● An example:
use strict;
use warnings;
use Inline C => 'DATA'; # 'DATA' is actually the default.
greet();
__DATA__
__C__
void greet() {
    printf("Hello world!\n");
}
20. Batteries Included: Basic Data Types.
● Inline::C already knows how to deal with passing basic C data types as parameters and return values.
● See perl/lib/ExtUtils/typemap for the full list.
● Let's pass C ints into and out of a function.
use Inline 'C';
my ( $apples, $oranges ) = ( 10, 5 );
my $total_fruit = add( $apples, $oranges );
print "$apples apples plus $oranges oranges equals "
    . "$total_fruit pieces of fruit.\n";
__END__
__C__
int add( int first, int second ) { return first + second; }
21. Returning a string is almost as easy (ignoring Unicode for now)
use Readonly;
use Inline C => 'DATA';
Readonly my $string => return_string();
print $string;
__DATA__
__C__
char* return_string() {
    return "Hello world!\n";
}
22. It's not going to stay that easy forever
● If your data type isn't in the typemap file, it doesn't get converted automatically.
● You may supply additional typemaps of your own.
● If you try to pass a type that isn't auto-converted, you will get a strange error message at compile-time.
int clength( char const *str ) { … }
Use of inherited AUTOLOAD for non-method main::clength() is deprecated at simple params2.pl line 11.
● This and other cryptic messages will taunt you often.
23. A quick definition of terms
● Perl has the following containers:
  ● SV = Scalar Value
  ● AV = Array Value: Contains Scalars.
  ● HV = Hash Value: Contains Scalars.
● SV's can contain one (or often more) of the following:
  ● IV = Integer Value (or pointer)
  ● UV = Unsigned Int
  ● NV = double
  ● PV = string
  ● CV = A coderef (subref)
  ● RV = a pointer to another SV (SV*) (Source: perlguts)
● I'm afraid it's time to read perldoc perlguts.
  ● perlguts introduces the components that comprise the Perl API.
  ● The functions documented in perlguts will be your means of manipulating Perl data in C.
24. The four scenarios
● These four calling scenarios are documented in Inline::C
● Simple: Fixed number of params, all types built into the typemap:
  – int Foo( int arg1, char* arg2, SV* arg3 )
  – All arguments and single return value are specified in the typemap, so you just write your pretty little C subroutine and life is good. The conversions are automatic.
● Return a list or “nothing”. Parameter list fixed.
  – void Foo( int arg1, char* arg2, SV* arg3 )
  – Either you really want to return nothing, or you want to build the return value yourself and push it onto “The Stack.” Either way you have to be explicit.
● Variable length parameter list, simple return.
  – char* Foo( SV* arg1, ... )
  – Pop arguments off of “The Stack.”
● Variable length return, variable length param list.
  – void Foo( SV* arg1, ... )
  – Void return and unfixed number of args: Combine 2nd and 3rd techniques.
25. The First Scenario
● The first situation passes basic data types (or none), and returns a single basic data type. Parameter list is fixed length.
use Inline C => 'DATA';
print add_one( 10 ), "\n";
__DATA__
__C__
int add_one( int arg ) {
    return arg + 1;
}
26. The Stack
● The stack is Perl's internal means of passing parameters to, and return values from, subroutines.
● INLINE.h defines a set of convenient macros used to manipulate the stack. These are sugar for less convenient XS macros.
Inline_Stack_Vars
Inline_Stack_Items
Inline_Stack_Item(i)
Inline_Stack_Reset
Inline_Stack_Push(sv)
Inline_Stack_Done
Inline_Stack_Return(n)
Inline_Stack_Void
● Examples are better than definitions...
27. The Second Scenario
● Returning a list. (C doesn't do this by nature.)
__C__
void Foo( int arg1, char* arg2, SV* arg3 ) {
    int i;
    int max = arg1;  /* assumption: arg1 is the number of values to return (max was uninitialized) */
    Inline_Stack_Vars;
    Inline_Stack_Reset;
    for (i = 0; i < max; i++)
        Inline_Stack_Push(sv_2mortal(newSViv(i)));  /* mortal SVs are freed once the caller is done */
    Inline_Stack_Done;
}
● There are other macros, explained in the cookbook and perlguts, that create and manage new SV's dynamically too.
28. The Third Scenario
● Variable length argument list. Returning a single basic type.
● Variable length lists must pass at least one parameter.
use Inline C => 'DATA';
my $count = greet( qw/ George John Thomas James James / );
print "That's the first $count presidents.\n";
__DATA__
__C__
int greet(SV* name1, ...) {
    Inline_Stack_Vars;
    int i;
    for (i = 0; i < Inline_Stack_Items; i++)
        printf("Hello %s!\n", SvPV(Inline_Stack_Item(i), PL_na));
    return i;
}
29. The Third scenario, part 2
● Variable length arg list, void function (no return value).
● When manipulating the stack, if there's no return value you must specify that explicitly.
use Inline C => 'DATA';
greet( qw/ George John Thomas James James / );
__DATA__
__C__
void greet(SV* name1, ...) {
    Inline_Stack_Vars;
    int i;
    for (i = 0; i < Inline_Stack_Items; i++)
        printf("Hello %s!\n", SvPV(Inline_Stack_Item(i), PL_na));
    Inline_Stack_Void;  /* Explicitly specify no RV */
}
30. The Fourth Scenario
● Variable length argument list, return multiple values.
● Use “The Stack” both to read the parameters, and to push the return values.
  – Read items off the stack and optionally modify them with Inline_Stack_Item().
    ● Inline_Stack_Item() is a setter and getter.
  – Call Inline_Stack_Reset if you plan to push to the stack.
    ● Resets stack pointer to the beginning of the stack.
  – Inline_Stack_Push() as needed.
  – Inline_Stack_Done
31. “...no longer a dark, gray bird, ugly and
disagreeable to look at, but a graceful and
beautiful swan.”
– Hans Christian Andersen
The Ugly Duckling
32. Well, it's still ugly... It's C... and Perl... and Internals ;)
● When coding for Inline::C (and XS, for that matter), you may access Perl's containers: SV's, AV's, HV's, etc.
● C's native data types “do less”, but “do it faster”.
● Perl's containers “do more” but are slower.
● Weigh the tradeoffs:
  ● It's often easier to write generic algorithms that don't care what Perl's SV's contain, rather than extracting and manipulating the SV's contents.
  ● If you're going to implement the “do more” functionality anyway, just use the Perl containers unless you need non-Perl portability.
  ● If you're building a structure that will immediately get passed back, build it with Perl's containers.
  ● Use basic C data types in tight loops or for non-Perl portability.
33. An aside: Duff's Device: A loop...
do {                     /* count > 0 assumed */
    *to = *from++;       /* Note that the 'to' pointer is NOT incremented */
} while(--count > 0);
34. An aside: Duff's Device: A loop...unrolled
send(to, from, count)
register short *to, *from;
register count;
{
    register n = (count + 7) / 8;
    switch(count % 8) {
    case 0: do { *to = *from++;
    case 7:      *to = *from++;
    case 6:      *to = *from++;
    case 5:      *to = *from++;
    case 4:      *to = *from++;
    case 3:      *to = *from++;
    case 2:      *to = *from++;
    case 1:      *to = *from++;
            } while(--n > 0);
    }
}
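The K&R-style listing above writes every value to a single output location, which makes it hard to check on an ordinary machine. As a minimal sketch in standard C, here is a memory-to-memory variant of the same switch/do-while interleaving. The name duff_copy is hypothetical, and unlike the original the 'to' pointer IS incremented; the count > 0 assumption is kept and enforced with an assert.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: memory-to-memory variant of Duff's Device.
 * Unlike the original, 'to' is incremented. Requires count > 0. */
void duff_copy(short *to, const short *from, size_t count)
{
    size_t n = (count + 7) / 8;   /* number of passes through the do-loop */

    assert(count > 0);
    switch (count % 8) {          /* jump into the middle of the unrolled loop */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

With count = 11, the switch enters at case 3 and copies three elements, then one full pass copies the remaining eight.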
35. “If your code is too slow, you must make it faster.
If no better algorithm is available, you must trim
cycles.”
– Tom “Duff's Device” Duff
comp.lang.c, Aug 29, 1988
36. The Sieve of Eratosthenes
● Problem: Find all primes less than or equal to the integer 'n'
● One Efficient Method:
  ● Sieve of Eratosthenes
    – Uses a sieve to flag outcasts and retain candidates.
    – O(n log log n) time complexity.
  ● The sieve may be implemented as a bit vector for memory efficiency if the implementation is computationally efficient, though often it's not. Otherwise, a simple array.
● A pure function that lends itself well to computational benchmarking.
● We will look at Pure Perl, Inline::C, and Inline::CPP (C++).
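To make the bit-vector idea concrete, here is a minimal sketch in plain C with no Perl API involved. The function name count_primes and the choice to return a count rather than a list are assumptions made for illustration; this is not one of the benchmarked implementations.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: count the primes <= n with a bit-vector sieve.
 * One bit per candidate; a set bit means "composite".
 * Returns -1 if the sieve cannot be allocated. */
long count_primes(unsigned long n)
{
    unsigned char *composite;
    unsigned long i, j;
    long count = 0;

    if (n < 2) return 0;
    composite = calloc(n / CHAR_BIT + 1, 1);
    if (composite == NULL) return -1;

    /* i <= n / i is the same test as i * i <= n, without overflow */
    for (i = 2; i <= n / i; i++) {
        if (composite[i / CHAR_BIT] & (1u << (i % CHAR_BIT)))
            continue;                         /* i already flagged */
        for (j = i * i; j <= n; j += i)       /* flag multiples of i */
            composite[j / CHAR_BIT] |= (unsigned char)(1u << (j % CHAR_BIT));
    }

    for (i = 2; i <= n; i++)                  /* count the survivors */
        if (!(composite[i / CHAR_BIT] & (1u << (i % CHAR_BIT))))
            count++;

    free(composite);
    return count;
}
```

The sketch spends one bit per candidate, where an array of ints spends sizeof(int) bytes. The price is the shift-and-mask arithmetic on every access.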
37. The Sieve of Eratosthenes (Pure Perl)
sub pure_perl {
    my $top = ( $_[0] // $Bench::input ) + 1;
    return [] if $top < 2;
    my @primes = (1) x $top;
    my $i_times_j;
    for my $i ( 2 .. sqrt $top ) {
        if ( $primes[$i] ) {
            for( my $j = $i; ($i_times_j = $i * $j) < $top; $j++ ){
                undef $primes[ $i_times_j ];
            }
        }
    }
    return [ grep { $primes[$_] } 2 .. $#primes ];
}
38. The Sieve targeting Inline::C (with XS macros)
SV* il_c_eratos_primes_av ( int search_to )
{
    AV* av = newAV();
    bool* primes = 0;
    int i;
    if( search_to < 2 ) return newRV_noinc( (SV*) av );
    Newxz( primes, search_to + 1 , bool );
    if( ! primes ) croak( "Failed to allocate memory.\n" );
    for( i = 2; i * i <= search_to; i++ )
        if( !primes[i] )
        {
            int j;
            for( j=i; j*i <= search_to; j++ ) primes[i*j] = 1;
        }
    av_push( av, newSViv(2) );
    for( i = 3; i <= search_to; i += 2 )
        if( primes[i] == 0 ) av_push( av, newSViv( i ) );
    Safefree( primes );
    return newRV_noinc( (SV*) av );
}
39. The Sieve targeting Inline::CPP
#include <vector>
SV* pure_il_cpp()
{
    int search_to = SvIV( get_sv( "Bench::input", 0 ) );
    AV* av = newAV();
    if( search_to < 2 ) return newRV_noinc( (SV*) av );
    std::vector<bool> primes( search_to + 1, 0 );
    for( int i = 2; i * i <= search_to; i++ )
        if( ! primes[i] )
            for( int k, j = i; (k=i*j) <= search_to; j++ )
                primes[ k ] = 1;
    av_push( av, newSViv(2) );
    for( int i = 3; i <= search_to; i+=2 )
        if( ! primes[i] ) av_push( av, newSViv(i) );
    return newRV_noinc( (SV*) av );
}
40. Off to the races...
● Benchmarking
  ● All subs are wrapped in Perl sub-wrappers to bind calling parameters for the benchmarks.
  ● Benchmark at 2, and 1,000,000 for 10 seconds each.
● The C versions find 3,000,000 primes in less than 1/20th of a second.
● C version tested up to 650 million primes (several seconds).
  – Beyond that I ran out of memory using a sieve of ints.
● C++ version performs well over a billion primes.
  – A bit vector conserves memory.
42. XS and Inline::C; not just speed
● Unit testing of C code and libraries with Perlish tools.
● Linking to external C libraries.
● Extending Perl's interface to its internals
  ● Scalar::Util, Hash::Util, etc.
● Building on the strengths of each language in complex ways.
  – Extending Perl with C/XS
  – Embedding Perl in C/C++
43. An example from Perl Testing: A Developer's Notebook
BEGIN { chdir 't' if -d 't'; }
use strict;
use warnings;
use Test::More tests => 6;
use Inline C => Config => LIBS => '-lm', ENABLE => 'AUTOWRAP' ;
Inline->import( C => <<END_HEADERS );
double fmax( double, double );
double fmin( double, double );
END_HEADERS
is( fmax( 1.0, 2.0 ), 2.0, 'fmax() should find maximum of two values' );
is( fmax( -1.0, 1.0 ), 1.0, '... and should handle one negative' );
is( fmax( -1.0, -7.0 ), -1.0, '... or two negatives' );
is( fmin( 9.3, 1.7 ), 1.7, 'fmin() should find minimum of two values' );
is( fmin( 2.0, -1.0 ), -1.0, '... and should handle one negative' );
is( fmin( -1.0, -6.0 ), -6.0, '... or two negatives' );
● Ian Langworth; Chromatic. Perl Testing: A Developer’s Notebook (Kindle Locations 5471-5474).
44. C library functions
● You may use Inline::C to call functions from C libraries.
  ● You may need to write Inline::C wrappers, or pure Perl wrappers.
  ● Inline::C offers AUTOWRAP for trivial wrappers.
45. Linking to an external library
# From the Inline::C-Cookbook POD:
print get('http://www.axkit.org');
use Inline C => Config => LIBS => '-lghttp';
use Inline C => <<'END_OF_C_CODE';
#include <ghttp.h>
char *get(SV* uri) {
    SV* buffer;
    ghttp_request* request;   // A data type from libghttp.
    buffer = NEWSV(0,0);
    request = ghttp_request_new();
    ghttp_set_uri(request, SvPV(uri, PL_na));   // function call from libghttp.
    ghttp_set_header(request, http_hdr_Connection, "close");
    ghttp_prepare(request);   // and a few others...
    ghttp_process(request);
    sv_catpv(buffer, ghttp_get_body(request));
    ghttp_request_destroy(request);
    return SvPV(buffer, PL_na);
}
END_OF_C_CODE
46. AUTOWRAP => 1
● Inline::C can automatically generate the function wrappers to link functions in a library with Perl.
● Also from the Inline::C-Cookbook
package MyTerm;
use Inline C => Config =>
ENABLE => AUTOWRAP =>
LIBS => "-lreadline -lncurses -lterminfo -ltermcap ";
use Inline C => q{ char * readline(char *); };
package main;
my $x = MyTerm::readline("xyz: ");
47. Perl's power tools.
●
XS can call Perl subroutines by symbol, or subref.
●
XS can accept subrefs as params, or return subrefs. (Callbacks, Currying)
●
XS functions have access to package globals.
●
XS functions have access to the lexical pad.
●
XS functions can create lexical blocks (create closures, for example).
48. Practical approaches toward objects
● Build in Perl, optimize in C
  ● Build all the methods in Perl.
  ● Those object methods that stand out in profiling may be rewritten in Inline::C.
  ● 90% of the benefit of Inline::C with only 10% of the fuss.
● Inline::CPP – C++ objects become Perl objects.
  ● Public data members get accessors and are exposed to Perl.
  ● Public methods are exposed to Perl.
  ● Private data members and methods aren't exposed to Perl.
  ● C++ constructors, destructors, inheritance, and so on.
49. An object with Inline::CPP
use Inline CPP => <<'END';
int doodle() { return 42; }
class Foo {
  public:
    Foo() : data(0) { std::cout << "creating a Foo()" << std::endl; }
    ~Foo()          { std::cout << "deleting a Foo()" << std::endl; }
    int get_data()  { return data; }
    void set_data( int a ) { data = a; }
  private:
    int data;
};
END
50. An Object with C++ (How Perl sees it)
sub main::doodle { return 42; }
package main::Foo;
sub new { print "creating a Foo()\n"; bless {}, shift }
sub DESTROY { print "deleting a Foo()\n" }
sub get_data { my $o=shift; $o->{data} }
sub set_data { my $o=shift; $o->{data} = shift }
51. Distribution
● An Inline::C dependency
  ● C code gets compiled at installation time, so end user never has a “long wait” at startup.
  ● Math::Prime::FastSieve is a proof-of-concept here.
● InlineX::C2XS
  ● Converts Inline::C code to an XS module, with no Inline::C dependency.
● Or you can copy the XS output from Inline::C and paste it into an XS module framework.
  ● List::BinarySearch::XS is an example of this approach.
52. The making of My::Module with an Inline::C
dependency.
●
h2xs -PAXn My::Module
●
Modify the Module.pm file:
package My::Module;
$VERSION = '1.23';
use base 'Exporter';
@EXPORT_OK = qw(cfoo cbar);
use strict;
use Inline C => 'DATA',
VERSION => '1.23',
NAME => 'My::Module';
1;
__DATA__
__C__
// C code here...
●
The NAME and VERSION parameters are required, and must match the package name and $VERSION.
53. The making of My::Module (almost done)
●
In Makefile.PL:
●
Change: use ExtUtils::MakeMaker
●
To: use Inline::MakeMaker
●
perl Makefile.PL
●
make dist
●
Remember: The C code is compiled on install; user never sees a
delay during startup or runtime.
54. The Inline and XS communities
●
inline@lists.perl.org
●
perl-xs@lists.perl.org
55. Other Inline Attractions (continued)
● Inline::CPP (C++)
  ● I took over maintenance from Neil Watkiss in November 2011 after it had gone unmaintained since 2003.
    – It was only passing on about 10% of CPAN smoke-testers.
    – Now it's passing about 90% of CPAN smoke-testers.
    – We still have some challenges.
  ● Contributors (code, FAIL-test-systems, etc.) welcome.
● Inline::Python
  ● “Inline::Python is extremely important for the company I work for which is why I can spend quite some time on it. So it's future is very secure :)”
    – Stefan Seifert: Current maintainer, in an email message to the Inline mailing list on 12-5-11.
57. 'The best compliment I've gotten for CPR is when my
ActiveState coworker Adam Turoff said, ``I feel like my
head has just been wrapped around a brick''.'
– Brian “Ingy” Ingerson
Pathologically Polluting Perl (perl.com)
Feb 2001
58. The Brick: Inline::CPR
● Inline C Perl Run
● Running Perl from a C program (quick and dirty embedding)
#!/usr/local/bin/cpr
int main(void) {
    printf("Hello World, I'm running under Perl version %s\n",
           CPR_eval("use Config; $Config{version}")
    );
    return 0;
}
● I think that's enough on that topic. ;)
59. Conclusions: Why C, Inline::C, and XS?
●
An application needs to get at functionality built into existing C libraries.
●
Windowing and graphic systems.
●
Specialized C libraries.
●
Operating-system libraries.
●
Other “tool reuse.” If a C library exists, why rewrite it in Perl?
●
Getting “Close to the metal.”
●
An application or extension has a bottleneck and no better algorithm exists.
●
Unit testing of C code, using simple Perl tools. Expose C functions to the TAP harness.
–
Inline::C in particular makes these transitions less complex, which facilitates writing and testing more robust code.
60. “You can sometimes write faster code in C, but you can
always write code faster in Perl. Since you can use each
from the other, just combine their strengths as you see fit.
(And tell the dragons we said "Hi.")”
– Larry Wall; Tom Christiansen; Jon Orwant.
Programming Perl, 3rd Edition
61. Resources:
●
Advanced Perl Programming, 2nd Edition – An excellent introduction to Inline::C and XS.
●
Inline, Inline::C, Inline::C-Cookbook, Inline::Struct, Inline::CPP POD, on CPAN
●
perlguts, perlcall, perlapi – Must read.
●
perlxs, perlxstut, perlmod, perlembed – Unavoidable.
●
Perl Best Practices – Discusses Inline::C as a Best Practice where XS is needed.
●
Perl Cookbook, 2nd Edition – Several Inline::C examples.
●
Programming Perl, 3rd Edition – XS-centric, but still applicable toward learning Inline::C.
●
http://www.perl.com/pub/2001/02/inline.html “Pathologically Polluting Perl”
●
Mailing Lists: inline@perl.org and perl-xs@perl.org.