Granting Plone an API
Thoughts on using Martian (Grok) and other tools to give Plone a more well-defined API.
One of the most common questions heard in the Plone support channels, from Python developers who come to Plone for the first time wanting to customise and extend it, is "where is the API?" This question is pretty difficult to answer with a straight face.
First of all, Plone has a lot of extension points. If you know the system well, you can customise or override virtually any aspect of it. This is really good, but it does mean that a new developer often can't see the wood for the trees, so to speak. There's a fair amount of "bottom-up" learning required in order to understand the fundamental customisation mechanisms (acquisition, adapters, utilities) and best practice, and to realise how the extension points are distributed across the codebase.
Secondly, to revisit the age-old "is Plone a product or is it a framework?" conundrum, consider that Plone core is (or certainly has been) developed in a product-like fashion. The "productness" refers both to the end user experience, where development is normally driven by a desire to introduce useful new features, and the integrator experience, where development is driven by a desire to make things customisable, extensible and re-usable.
The problem is that these two aspects of the "Plone product" are developed as part of one codebase, with no clear separation between the two in terms of code location, documentation or development practices. If Plone had been developed primarily as a framework, then a lot more thought would have gone into "the API", simply because "the API" is the end product of a framework.
Retro-fitting an API
So, does Plone need an API? In my opinion, it does - or at least, it needs to formalise and better expose the API that's already there, in order to make life easier for new (or forgetful) developers.
Having a more stable set of API touch points would allow more bite-sized learning: you can get a lot done with an incomplete understanding of how things work. This in turn leads to more immediate gratification for developers using Plone. A good rule of thumb is that a new developer picking up Plone should feel that they've achieved something in the first hour (say, getting it installed and running), in the first day (say, configuring the system through the GUI or building a simple content type), in the first week (say, delivering a solution that meets one or more important use cases) and in the first month (say, delivering a small project).
If we, the Plone core developers, are to facilitate such bite-sized learning and immediate gratification, we need to identify common use cases and needs, and make explicit steps for those. In part, that's a documentation challenge, but I'd argue that it's also about API design. (I've talked about this before, of course.)
Where to put it?
We want to make it obvious both where to look ("where is the API?"), and what to do ("what is the API?"). This means that we can't just do more of the same. Yes, we are already quite good at writing interfaces with docstrings and doctests for small packages (doctests are a great way to expose, exercise and document the API). Since Plone 3, we've also been much better at breaking Plone's services into smaller, more digestible packages, rather than the one big Products.CMFPlone package. This helps make extension/customisation more accessible to people who already understand how extension and customisation work in Plone, but it doesn't help those who need a much smaller set of concepts to get started with.
Furthermore, even small, specialised packages invariably contain a lot of implementation detail. For a user of the package, it's not always easy to know how much of the package they need to understand. Packages grow organically over time, acquire some ugliness that's required for backwards compatibility, get re-organised (thus "breaking" documentation) and so on. We could try to introduce more conventions, such as having a '.api' module in each package, but even this gets hard to maintain and, in my own experience, sometimes feels like a bit of a straitjacket when trying to get a piece of functionality working.
I think the difficulty here stems in part from the fact that good API design is difficult, demands a lot of experience, and requires a different mindset from regular implementation work. When I've tried to do this in the past, I've found that the APIs I design "bottom-up", as a consequence of some piece of functionality I've written, tend to be cumbersome and inconsistent. To design an API I'm proud of, I need to start at the other end: how would I like this to look and feel?
With this in mind, I've come to like the pattern of having an API package. Let me show what I mean by that with an example.
As it happens, I've been doing this quite recently, in the context of Dexterity. I've blogged about the Dexterity API design before (warning: article is slightly out of date, though mostly representative). In many ways, Dexterity is all about API design. There's not a whole lot of radically new code there. Mostly, it's about stitching together and simplifying things that already exist and making one compelling, consistent story.
We started out having the API part of the plone.dexterity package. You'd import the bits you needed and be done with it. This didn't really work, though, because the code you needed was (correctly) spread across various modules, and it was not so easy to remember what imports to use.
To solve this, we collected useful imports in plone.dexterity's __init__.py file. This meant you didn't need to remember the exact location for most common imports, but it too got out of hand.
For one thing, we wanted to use Martian, the convention-over-configuration framework that underpins Grok, to make some APIs even more natural. However, we did not want to make martian (or the various grokcore.* packages that form the re-usable bits of Grok) an explicit dependency of plone.dexterity. There was simply no need, and we wanted to keep the base package as light as possible.
Secondly, we wanted to augment the API with doctests and examples. It became a bit awkward to keep these inside the base package, because test setup was quite different (the API tests are integration tests, the core tests are unit tests), and we could never find a good place to put the various "API" bits.
Finally, we created two new packages: plone.directives.form and plone.directives.dexterity. Both of these contain a lot of documentation, examples and integration doctests, but no code that directly contributes to functionality - only convenience imports and martian grokkers and directives (more on that later).
The idea is that imports should look natural, and that you should be able to get going simply by reading the documentation in these "integrator-facing" packages without having to dig deeper into the stack. Here is an example:
from five import grok
from zope import schema
from plone.directives import form, dexterity


# Schema interface describing the fields of the content type
class IMyPage(form.Schema):

    summary = schema.Text(
        title=u"Summary",
        description=u"Summary of the body",
        readonly=True
    )

    body = schema.Text(
        title=u"Body text",
        required=False,
        default=u"Body text goes here"
    )


# Content class; grok.implements() declares that it provides IMyPage
class MyPage(dexterity.Item):
    grok.implements(IMyPage)

    @property
    def summary(self):
        # Derive the summary from the first 30 characters of the body
        if self.body:
            return "%s..." % self.body[:30]
        else:
            return ""
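To make the convenience-import idea concrete, here is a minimal sketch that uses only the standard library. Everything in it is hypothetical: the "toystack" package, its submodules and the register_module() helper are stand-ins invented for illustration, and the modules are built in memory purely so that the sketch runs without any files on disk. It is not how plone.directives.* is actually laid out, only an illustration of an integrator-facing module that re-exports names and adds no behaviour of its own.

```python
import importlib
import sys
import types


def register_module(name, **attrs):
    """Register an in-memory module so the sketch needs no files on disk."""
    module = types.ModuleType(name)
    vars(module).update(attrs)
    sys.modules[name] = module
    return module


class Item:
    """Stand-in for a content base class buried in an implementation module."""


class Schema:
    """Stand-in for a schema base class that lives somewhere else entirely."""


# Hypothetical implementation package: useful names are spread across modules.
register_module("toystack")
register_module("toystack.content", Item=Item)
register_module("toystack.supermodel", Schema=Schema)

# Hypothetical integrator-facing module: it only re-exports the names a
# newcomer needs and contributes no functionality of its own.
content = importlib.import_module("toystack.content")
supermodel = importlib.import_module("toystack.supermodel")
register_module(
    "toystack.api",
    Item=content.Item,
    Schema=supermodel.Schema,
    __all__=["Item", "Schema"],
)

# Integrator code: one memorable import instead of two scattered ones.
api = importlib.import_module("toystack.api")


class MyPage(api.Item):
    """A content class written against the facade only."""
```

The design point is that the facade can be documented and tested separately from the implementation, and the implementation modules can be re-organised without breaking integrator code, as long as the facade keeps exporting the same names.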
Why Martian (and convention-over-configuration)?
The code above uses Martian grokkers and directives to configure the application. There are a few reasons for this.
First of all, there is no need for a separate ZCML file to register components. In terms of making Plone code easy to understand and get started with, this is quite important. Whilst I think XML configuration is good for externalising configuration from code, I don't think we should require people to use it to wire up all their code, all the time. It's probably a good idea to require it for more advanced operations, such as overriding components from other packages, but experience suggests that if day-to-day coding requires too many open files in your editor, then it's easy to get lost. I've seen projects grow to having a configure.zcml that's a thousand lines long and impossible to read, and hundreds of files in a single package, with inconsistent naming to boot.
Secondly, convention-over-configuration is a way to encode "best practice". By steering people down a particular path, we can help them organise their code in a way that is at least predictable, and probably better than what they would've come up with on their own if they lack experience of the system. Don't underestimate this: it makes it easier for a developer to correlate what he's doing with other examples and documentation, and makes the results more predictable should he need help from others.
(As an aside: It would be very wrong to force everyone to follow the same conventions 100% of the time. Martian and Grok do not do this, but they do reward you for following a few conventions by letting you save a bit of time. Contrary to some frameworks, however, your existing code will not need to be rewritten the moment you decide to deviate from the conventions).
Thirdly, the example above, and the Martian grokkers and directives behind it, have been refined to make the code as easy to read, understand and write as possible. (A grokker is the thing that looks for features of the code, e.g. file names or base classes, and performs configuration actions based on how the code is structured. Directives are the configuration "hints" that hang off classes, e.g. grok.implements().) IMyPage is a schema, it has two fields, and it's implemented by the MyPage class, which is an Item (as opposed to a Container).
The style of importing top-level configuration packages - "form", "dexterity", "grok" - is also by design. The prefixes should be natural and hint about what the configuration is for. plone.directives is a namespace package, and so we can have more of these as required, corresponding to the common actions: building forms, building content types, providing UI components for theming, etc.
Not all of these things are consequences of using Martian and Grok-like patterns, of course, but they certainly help. The Grok project spends a lot of its time thinking about API design, and is led by some talented, experienced API designers in Martijn Faassen, Philipp von Weitershausen and others. Since Grok uses many of the same Zope 3 components as Plone, and since the Grok, Zope and Plone communities overlap, it is prudent to adopt the relevant conventions for consistency, if nothing else.
A plan, of sorts
Roughly speaking, here's what I'd like to see:
- We identify the core APIs that are used by 80% of third party developers. This should be relatively easy.
- We identify the core problem domains that comprise 80% of what third party developers actually want to do.
- We try to match the two: what "API coverage" do we have?
- We do a bit of "blue sky thinking": if we could choose, how would we like "the API" to look? Here, we can learn a lot from Grok, in my opinion. Grok is also inspired by the DRY (Don't Repeat Yourself) principle, which is very sound, and a desire to keep things as consistent and obvious as possible.
- We carefully design plone.directives.* packages that provide such an API. This is about more than code: the README files (and PyPI front pages) of these packages should be enough to get you going. Perhaps we want to use Sphinx to make packages more self-documenting, or accompany each package with one or more tutorials on plone.org.
Ideally, finding out how to accomplish a common task with Plone should involve nothing more than finding the right plone.directives.* package and reading its documentation and following its examples (at least once you know how to install Plone using buildout and create a package using paster, which is pretty easy).
Don't try this at home, kids
I think the plan above is eminently achievable, and not really very much work. However, it has one huge risk: we could get it wrong and make a cumbersome, inconsistent or repetitive API. That'd be worse than doing nothing, since it'd only add to the confusion.
For this reason, I'm going to be slightly arrogant: Please, don't release packages in the plone.directives.* namespace willy-nilly, at least not without seeking some input from the community. This is an area where it really pays to have many eyes and seek lots of input. I've been the primary developer behind plone.directives.form and plone.directives.dexterity to date, and it's taken me four or five attempts to get it to a state where I'm pretty happy with it.
By all means, if you have a package that's got some kind of framework aspect, consider using the same patterns and conventions. You may not need a whole separate "API package" if your problem domain is a bit smaller than Plone itself, but having an API sub-module could be advisable if you want people to extend your package easily.
And consider this my vote for making use of Grok-inspired convention-over-configuration at the outer rim of Plone's stack, where it can make an impact in terms of making Plone more accessible to new developers (whether we should adopt grokcore.* and five.grok when writing Plone core components is another matter, for another discussion, but also altogether less urgent).