The document discusses how anyone can contribute to the Apache Solr project regardless of their technical skills, by participating in the community through documentation contributions, reporting bugs, testing patches, and providing feedback on proposed features. It also provides an overview of the technical steps for building Solr from source code and working with the project's Jira issue tracker and patch submission process. The goal is to encourage more community involvement and make the contribution process less intimidating for newcomers.
How To Be a Solr Contributor - Chris Hostetter, Lucidworks
1. How To Be a Solr Contributor
Chris "Hoss" Hostetter - 2018-10-18
https://home.apache.org/~hossman/activate2018/
https://twitter.com/_hossman
https://www.lucidworks.com/
Session Abstract:
Every bug fix starts with a bug report.
Every feature starts with an idea.
Every line of code, every page of documentation, every automated test case --
they all exist because of communication & collaboration.
In this session, we'll discuss the ways in which all Solr users (regardless of Java
know-how) can make meaningful contributions to Solr: helping to diagnose & fix
bugs; improving documentation; designing & implementing new features; etc.
2. Who & What
Software Developer
~21 years working on "Search" software
~14 years working on Lucene/Solr
Employed by Lucidworks to "Make Solr better"
Getting started as a Solr Contributor
Growing as a Contributor
Thinking like a Committer
Non-technical discussion
No Java!
3. Anyone Can Contribute To Solr
There are so many ways to contribute to open source
if we look past the obvious steps of writing a new
product feature. Everyone who uses open source can
bring their skills to the community and help keep
open source a vital part of computing.
— Andy Lester (Blog Post)
Citation: 14 Ways to Contribute to Open Source without Being a Programming
Genius or a Rock Star
Andy also published another blog along the same lines more recently: The
Beginner’s Guide to Contributing to Open Source Projects
When I was first learning to program, Andy was a prolific mentor in the Perl
Usenet newsgroups that I was able to learn a lot from. Decades later, Andy
popped up in the #solr IRC channel and I was able to help him learn about Solr.
Different people have different skills and different experiences, all of which can
be useful contributions to open source communities & code bases like
Lucene/Solr.
4. Why Contribute?
Companies that contribute and give back learn how
to better use the open source software in their own
environment
— Frank Nagle (Harvard Business School Study)
Citation: The Hidden Benefit of Giving Back to Open Source Software
Beyond the obvious value to an open source project when people make
contributions, there is value gained from the experience of contributing, and how
it helps increase your understanding of the software and how to use it. This
experience and increased understanding doesn't just help you as an individual,
but can also benefit your co-workers and company as a whole.
Frank Nagle, Assistant Professor at Harvard Business School, recently did formal
research on this, which can be a helpful citation if/when your boss asks
why you want to contribute back to Solr (or any open source project) on
company time.
Nagle, Frank. "Learning by Contributing: Gaining Competitive Advantage
Through Contribution to Crowdsourced Public Goods." Organization Science 29,
no. 4 (July–August 2018): 569–587.
6. Participate
Mailing lists (User & Dev)
IRC
Stack Overflow
Blogs & social media
Meet-Ups, conferences, hackathons
Test nightly builds, new releases, new features
Ask / answer questions
Share links to docs, bug reports, past discussions
Discuss your personal experiences
The easiest way to contribute to Solr is to participate in the user community.
Listening to the experiences of others, and sharing your own experiences not
only helps make the community stronger, but it helps everyone involved
understand Solr with greater breadth & depth (empowering further
contributions).
A big part of community collaboration is listening & replying to questions. Even
when you don't have firsthand experience with a feature/bug someone
describes, you can still help by sharing links to documentation or past discussions
you've seen on that topic.
I started trying to answer 1 question a day on the user's list. "Doing
well by doing good", since that required that I look at some parts of
Lucene that I didn't already know about which grew my own skill set
while helping others get into Lucene as well.
— Erick Erickson (Lucene/Solr Committer)
But community isn't just about Q&A -- it's also about discussing what works well,
what could be improved, and which use cases might be made easier with
new/modified features. These kinds of questions & discussions are all valuable
contributions at the community level that can build into documentation & code
contributions -- even if you don't personally know how to make those changes yourself.
7. [Diagram: icon equations combining different setups with feature X; most icons lost in extraction]
Sharing the details of your experiences can be helpful when answering questions
or diagnosing bugs, even if your use case or situation is not exactly the same as
that of the person asking.
As an example of how a shared user experience can be helpful: if a new user says
"I can't get X to work", just sharing that you have seen X work can help them
realize that the problem may be more nuanced than they thought, and that there may be
other details of their situation/setup that are contributing to the problem.
Sharing details of how you use "◔ + X" may help them realize how their setup
differs from yours (" + X").
Even without knowing a lot about " " you might be able to experiment with it,
and try out what the other user is describing, to let them know if you can
reproduce their problem, or point out additional nuances in their configuration
they may not have considered.
Ultimately it may just turn out that the other user didn't understand some
subtleties of the situation, and your contribution helped them see that. Or maybe
there is a gap in the documentation, or a bug in the code that only manifests itself
in specific circumstances -- either way your contributions to the discussion may
help identify the root of the problem.
8. Overlapping Experiences
[Figure: diagram relating Developer, Novice User, and New User (marked "?") to Feature X]
Even if you don't consider yourself a "Solr Expert", and even if you are only
familiar with a small subset of Solr functionality, your personal experience
dealing with the features you do use can be invaluable. Solr developers may have
broad and deep knowledge, but that doesn't mean they know
everything there is to know about every possible feature and how it behaves in
every possible situation...
10. Documentation
Contributions to the community, while valuable, are largely ephemeral. The next
step to take in contributing to an open source project is to help ensure that the
knowledge shared in the community is preserved and easy to find in the future.
For Solr, the official user documentation is the Solr Reference Guide. Equally
important, but targeted to developers working to improve Solr, are the
Lucene/Solr Javadocs (automatically generated from the source code
comments).
11. Docs = Project Gateway
Documentation is also the entry point into a project
for most people, and a lack of good documentation
inherently limits its reach. By contributing to
documentation, you can help attract new users (and
potentially new contributors) to a project.
— James Turnbull (Increment Magazine)
Citation: Documentation as a gateway to open source
12. Good Docs = Hard
Comprehensive
Accurate
Useful
Current
When new features are added, it can be very easy for developers to write
"accurate" documentation ("The X option does XXX, the Y option does YYY, ...")
but a broad range of perspectives is crucial to ensuring that the documentation
is "useful" to a broad range of users. On large projects like Solr, it can also be hard
for developers working on individual features to ensure that the documentation
is "comprehensive" because of all the ways features might interact with each
other, and even how a single feature might be discussed in different ways in
different parts of the documentation. "Many Eyes" with many different levels of
experience can help ensure that documentation like the Solr Reference Guide
stays accurate, useful, comprehensive and "current" as time goes on.
Solr has a lot of cool features but most new users get lost in them. So,
all of my presentations and most of my contributions are about making
that onboarding experience easier. I especially like to connect different
parts of Solr together and show how something works end-to-end.
— Alexandre Rafalovitch (Lucene/Solr Committer)
13. Good Intro Docs = Harder
[Figure: chart with axes "Depth of Knowledge" and "Experience", plotting Seasoned Developer, Novice Developer, Seasoned User, New User, and Novice Users]
When writing docs directed at "New Users", the various perspectives of "Novice"
users are extremely valuable because their perspectives have a lot in common
due to similar frames of reference.
While seasoned & experienced users and developers may have a broader (or
deeper) understanding of how Solr works, or the specifics of particular features,
that breadth & depth of understanding can be a hindrance to writing useful
documentation for new users, because of the disconnect in
experience/perspective when dealing with the software.
14. Contributing To Docs
Read, review, & critique
Share what you find confusing from your perspective
Propose alternative structure / content
Offer additional content / examples
Suggest new documentation / examples for proposed features
Help developers think about how functionality may be perceived before
API/UX solidifies
Contributing content can be as simple as a comment on the mailing list, or in
the issue tracker
Or dive in and submit a documentation patch!
When I was getting started as a contributor, I thought that I'd learn the
most if I followed the lists exhaustively. [...] it's good to keep up on what
the community is doing. But if the goal is learning, then the Ref Guide
or just getting your hands dirty are often more efficient ways to start.
[...]
Find a feature you're interested in. If there are examples, try them out
and correct the docs where they need clarified. If there aren't examples
try writing your own.
— Jason Gerlowski (Lucene/Solr Committer)
16. Building Solr From Source
# First snippet: install the core tools (Debian Linux)
apt-get install openjdk-8-jdk git ant
# Second snippet: clone the repository and bootstrap the build
git clone http://git-wip-us.apache.org/repos/asf/lucene-solr.g
cd lucene-solr && ant ivy-bootstrap
# Third snippet: reset to a clean, up-to-date checkout (discards local changes),
# then test a patch or create your own
git clean -fdx && git reset --hard && git pull
git apply < /tmp/SOLR-ABC.patch
cd solr && ant documentation create-package
git diff > /tmp/SOLR-XYZ.patch
https://wiki.apache.org/solr/HowToContribute
The first snippet of shell commands shows the basic steps needed (on a Debian
Linux system) to install the core tools you'll need to build Solr from source. These
commands will be slightly different on Mac or Windows machines, but should be
very straightforward. You'll only need to do this once on any given machine
where you wish to work.
Regardless of your OS, the second snippet of shell commands shows the basic
steps to create a local mirror of the Lucene/Solr source code repository on your
system and "bootstrap" the build process. These steps will only be needed once
on any given machine where you wish to work.
Regardless of your OS, the third snippet of shell commands shows some basic
commands you might want to run anytime you want to help test a patch, or
create your own patch with suggested changes.
There is a LOT of additional information about working with the Lucene/Solr Git
repo and working with patch files on the "How To Contribute?" Wiki page --
including a lot of details on: working with GitHub, working with the Lucene/Solr
code in common Java IDEs, and how our test framework works for running
randomized (but reproducible) automated tests -- if you are comfortable doing
those things.
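Before attempting a build, it can help to confirm that the core tools are actually on your PATH. The following is a minimal sketch, not part of the original slides: the function name missing_tools is hypothetical, and it only checks that the commands exist, not that their versions are suitable for building Solr.

```shell
#!/bin/sh
# Print the name of each listed command that is NOT found on the PATH.
# (missing_tools is a hypothetical helper name used for illustration.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Check the three tools installed by the first snippet on the slide.
missing=$(missing_tools java git ant)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "java, git and ant are all available"
fi
```

If anything is reported as missing, install it with your platform's package manager before cloning the repository.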
18. Working With Jira & Patches
Jira is the issue tracker used by Lucene/Solr to track Bug Reports, Feature
Requests, etc.
Patch Files are an efficient way of capturing & exchanging suggested changes to a
code base. In the Lucene/Solr project, contributors are encouraged to upload
patch files to Jira as attachments to relevant issues.
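To make the patch round trip concrete, here is a small sketch that can be run anywhere Git is installed, without a Solr checkout. It assumes a throwaway repository in a temporary directory; the file README.txt, its contents, and the repository name "demo" are invented for illustration, and only the patch file name echoes the slides. It creates a patch with git diff, resets the working tree as a reviewer starting from a clean checkout would, then dry-runs and applies the patch.

```shell
#!/bin/sh
set -eu

# Throwaway repository in a temp dir (all names here are illustrative).
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.email "demo@example.com"
git config user.name "Demo Contributor"

# Commit a file containing a typo.
printf 'Solr is a serch server.\n' > README.txt
git add README.txt
git commit -q -m "Initial commit"

# Contributor side: fix the typo and capture the change as a patch file.
printf 'Solr is a search server.\n' > README.txt
git diff > "$work/SOLR-XYZ.patch"

# Reviewer side: start from a clean tree, dry-run the patch, then apply it.
git reset -q --hard
git apply --check "$work/SOLR-XYZ.patch"
git apply "$work/SOLR-XYZ.patch"

# Show a summary of what the patch changed in the working tree.
git diff --stat
```

The --check step reports whether the patch would apply cleanly without modifying any files, which is a cheap first test before building and trying out someone else's patch.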
19. Review Bug Reports
Ask questions
Try to reproduce
Suggest & test workarounds
Provide alternative/counter examples
Propose documentation clarifications
Point out similar/related issues
Test proposed patches
What additional information would be helpful to a developer trying to make sense of this
Bug Report?
Try to reproduce
Confirm initial reports
Elaborate on vague reports
Summarize simplest precondition
Provide alternative/counter examples
When else does "It" fail?
When does "It" work?
Test proposed patches
Does the bug go away?
Does the fix have unexpected side effects?
20. Discuss Feature Proposals
Ask questions
Clarify objectives
Give feedback on user experience
Suggest documentation changes/additions
Think holistically
Test proposed patches
What information would help the project maintainers decide if/how/why to add this
functionality?
Clarify objectives
Are the use cases clear?
Are there similar use cases that seem overlooked?
Are there use-cases that would be harmed by these changes?
Give feedback on User Experience (UX)
Is the UX consistent with similar existing functionality?
Is the new functionality appropriately general?
Test proposed patches
Does "It" work as described?
Does "It" have unexpected side effects?
Does the patch break any (un)related functionality?
Even longtime contributors and experienced project developers wish more
contributors would help review their proposals & patches...
I wish I received more code review feedback...
— David Smiley (Lucene/Solr Committer)
21. Think Like A Committer
Every feature has a cost that goes beyond the initial
implementation: maintenance cost, documentation
cost, and cognitive cost for your users.
...
... literally adding what the user requested is often
not the optimal choice. Users are focused on their
own specific use case, and you must counter this
with a holistic and principled vision of the whole
project.
— François Chollet (Blog Post)
Citation: Notes to myself on software engineering
22. Submitting Your First Patch
Documentation
Error messages
Code comments
New tests
Fixing bugs
Adding features
Put yourself in the shoes of other Contributors: What questions would you ask if someone
else posted this patch?
23. When you're first starting out, the best thing to do is to start small and with
changes you are comfortable with.
The community can be intimidating. There are some extremely clever
people in it, and I can never be as clever as them. But I can still help to
make improvements to stuff that they aren't so interested in, e.g.
startup scripts, or the UI, etc, stuff that doesn't need a degree in
obscurity to handle
— Upayavira (Lucene/Solr Committer)
Documentation edits are a great entry point because you don't have to know any
Java code at all. Building up from there, improvements to error messages and
comments can be a great next step as you learn more about how the code works...
If you want a low-risk entry point into the code itself, then the strings,
error messages, help text, and other non-code language elements in a
project are ideal.
— James Turnbull (Increment Magazine)
If you do understand Java code, improving/adding test cases is a great low-risk
way to improve the code base, and to verify that you understand what the code
is doing.
Whatever your comfort zone, you can make all of these types of changes not only
in new patches/Jiras you create to "scratch your own itch", but also in
collaboration with another contributor on existing issues/patches, which can be very valuable.
For example: Another contributor may have submitted a patch with a code
change, and you can post a followup patch that includes their changes as well as
documentation edits to explain how the new functionality works.
24. Yonik's Law Of Patches
A half-baked patch in Jira, with no documentation,
no tests and no backwards compatibility is better
than no patch at all.
— Yonik Seeley (Paraphrased Aphorism)
Citation: Yonik’s Law of Half Baked Patches
Although this is not exactly what Yonik originally said, it's commonly referred to
in the Apache community as "Yonik's Law of (Half Baked) Patches".
The bottom line is that "Perfection is the Enemy Of Progress" -- don't be afraid to
post incomplete works in progress as a starting point for discussion. Early
feedback may help shape the direction you take in improving a patch, and
save you from spending time going in the wrong direction. Ideally other people may help
you by iterating with their own additions to your initial effort, sharing the
workload.
25. Q & A
Ask questions if you don't understand. No one will
think less of you for it. They're just happy you're here
helping.
The smartest, most active committers are often the
first to be clear about what they don't understand.
That's how they get so much done. Follow their
example.
— Jason Gerlowski (Lucene/Solr Committer)
My thanks to Jason and several other committers (some of whom I've quoted
directly in earlier notes) who graciously shared their thoughts on "Being a
Contributor" in response to a questionnaire I sent them when I was preparing this
material.