Choosing a software package for a particular purpose involves evaluating several differentiating factors: the functionality of the package, its performance, its user-friendliness, and even the ability of an individual to find help, engage with others, and feel a sense of participation. The development, fostering, and design of the community around yt is both crucial to the success or failure of yt and in many ways inseparable from its functionality.
There are several rough categories of individuals engaged in the development and use of yt. As a result of its API-first design, few if any individuals use yt other than through the scripting interface; the vast majority (if not all) of those who interact with the functionality in yt therefore do so by writing their own scripts, modules, and code, arguably engaging in a value-added development process of their own. The majority of individuals using yt at present are in astronomy and astrophysics, typically in simulation-focused subfields, although an increasing number of individuals from other domains are participating in development and using yt for their own domain-specific problems.
To draw the distinction more clearly: some individuals have built and used their own scripts, while others have contributed changes or modules to the primary yt codebase. In addition, there is an emerging set of projects that build on yt as infrastructure to conduct scientific analysis. These developers are largely driven by their own pragmatic scientific needs, and they constitute the majority of developers (by number) who contribute to the codebase. The majority of these individuals are early- to mid-career researchers, typically graduate students, postdocs, and assistant professors.
In recent years, a more coherent contingent of individuals has emerged who pursue both pragmatically focused development of modules and functionality for their own benefit and modules or overall improvements that are supplemental or even external to their own research agenda. These include improvements to the unit handling, to the plotting code, to infrastructure for loading disparate datasets, and so on. At this time we do not know of any individuals funded to work on yt completely independently of a scientific or scholarly goal.
The composition of the community, particularly its mixture of timelines for goal-setting and completion, can at times cause frustrations and difficulties. For instance, the response to "Can this feature be implemented?" often includes an invitation for the questioner to collaborate on developing that feature and submitting it to the codebase. Developing a schedule of releases is an act of consensus building: it involves deciding both which bugs are critical to fix within the timeline of a release and which features should be considered blockers for a new release. The intersection of this process with academic deadlines (for instance, job application season) requires balance and care.
When evaluating the level of engagement, we consider several classes of tasks performed by individuals in the community, and evaluate them by how they lead into greater engagement:
- Filing issues
- Participating in mailing list discussions
- Issuing a pull request
- Writing documentation
- Participating in code review
- Drafting an enhancement proposal
- Closing bug reports
While there are other activities that individuals can participate in, these are the typical activities we see among participants in the community. The order, from first to last, reflects the typical progression of an individual coming to participate in the community. The first step is typically to file an issue or bug report (occasionally these are requests for new features), followed by participating in development-focused discussion on mailing lists. The next level of engagement typically involves developing a new piece of functionality, refining existing code, or issuing a fix for a bug or issue. These take the form of pull requests (described in greater detail here) that can be reviewed and added to the codebase.
The next level of engagement centers on tasks that are not fully aligned with pragmatic, code-driven scientific inquiry. The development of documentation is often viewed as orthogonal to the scientific process, and typically requires an iterative writing process. Participation in code review (providing comments, feedback, and suggestions to other authors) is another somewhat orthogonal task; it does not necessarily benefit the developer doing the reviewing directly (although it might), and it does not necessarily result in academic rewards (citations, authorship, etc.). It does, however, arise from a pragmatic (ensuring code reliability) or altruistic (the public good of the software) motivation, and is thus a deeper level of engagement.
The final two activities, drafting enhancement proposals and closing bug reports, are the most engaged, and often the most removed from the academic motivation structure. Developing an enhancement proposal for yt means iterating with other developers on the motivation behind and implementation of a large piece of functionality; it requires both motivation to engage with the community and the patience to build consensus amongst stakeholders. Closing bug reports, and the development work associated with identifying, tracking and fixing bugs, requires patience and often repeated engagement with stakeholders.
We include here plots of the level of engagement on mailing list discussions and the citation count of the original method paper.