Pytest-BDD vs. Behave

Love it or loathe it, Python is the language of choice for most Data Scientists. So if the work is done in Python and you are writing the requirements in Gherkin, what tool should you use to automate testing?

There is no official version of Cucumber for Python, so the options are Pytest-BDD, Behave, and Lettuce. Let’s eliminate Lettuce right up front; here is GitHub’s graph of commits for the project:

Lettuce is unfortunately a dead project; if you find a bug there is no community that will fix it for you. The last release was in 2016, so unless someone revives it, this isn’t a viable choice.

Pytest-BDD and Behave are both active, so both are worth considering. What are the differences?

Pytest-BDD

Advantages

Pytest-BDD has the advantage of being integrated with Pytest; your developers should find that part of it easy to work with. It can use Pytest fixtures, mocks, etc.

Disadvantages
 

There is no support for data tables – this is a HUGE problem. Data tables are used in many, many scenarios, like this one for a library catalog search:

Scenario: The catalog can be searched by author’s name; both first and last names are searched.

Given these books in the catalog
| Author        | Title                      |
| Stephen King  | The Shining                |
| James Baldwin | If Beale Street Could Talk |
When a name search is performed for Stephen
Then only these books will be returned
| Author       | Title       |
| Stephen King | The Shining |

The ‘Given’ in this scenario uses a data table to specify the contents of the catalog, and the ‘Then’ uses a data table to show which titles will be returned by the search. How could you possibly write this scenario without data tables?
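
To make the scenario concrete, here is a minimal Python sketch of the behavior those two tables specify. The `search_by_name` function and the dict-based catalog are assumptions made for illustration and are not part of either tool. The point is that the rows of the ‘Given’ table become the input data and the rows of the ‘Then’ table become the expected result.

```python
# Hypothetical catalog search; the function name and data layout are
# illustrative assumptions, not code from Pytest-BDD or Behave.

def search_by_name(catalog, name):
    """Return books whose author's first or last name equals `name` (case-insensitive)."""
    wanted = name.casefold()
    return [
        book for book in catalog
        if wanted in (part.casefold() for part in book["Author"].split())
    ]


# Rows of the 'Given' data table.
catalog = [
    {"Author": "Stephen King", "Title": "The Shining"},
    {"Author": "James Baldwin", "Title": "If Beale Street Could Talk"},
]

# Rows of the 'Then' data table.
expected = [{"Author": "Stephen King", "Title": "The Shining"}]

assert search_by_name(catalog, "Stephen") == expected
```

Because the examples sit in the scenario itself, anyone reading it can check the expected rows without opening the test code.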

Some people will substitute a Pytest fixture for the data tables, but this defeats the primary reason for using Specification by Example (SBE). That reason is not to automate testing; it is to make it incredibly easy for the Product Owner, Business Analyst, Developer, and QA to communicate clearly. Hiding the examples in a fixture destroys the expressiveness of the scenario. What would the Product Owner say if you asked them whether this requirement is correct?
 
Scenario: The catalog can be searched by author’s name; both first and last names are searched.
 

Given the books from Fixture1 in the catalog

When a name search is performed for Stephen

Then only the books from Fixture2 will be returned

This requirement does not facilitate communication; it isn’t Specification by Example because the examples are hidden in the fixtures!

It is possible to work around this by modifying the regex in the step definition so that the regex includes the data table, but that is rather clumsy.
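
As a rough sketch of that workaround, the pattern below treats the table as part of the step text and parses it by hand. It uses only Python’s `re` module; the step wording, the `STEP_PATTERN` name, and the `parse_table` helper are assumptions for illustration. Whether a given Pytest-BDD version hands multi-line step text to a step definition in this form should be checked against its documentation.

```python
import re

# Hypothetical step pattern: the step sentence followed by the raw table lines.
STEP_PATTERN = re.compile(
    r"these books in the catalog[ \t]*\n"
    r"(?P<table>(?:[ \t]*\|.*\|[ \t]*(?:\n|$))+)"
)


def parse_table(raw):
    """Turn pipe-delimited lines into a list of dicts keyed by the first row."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in raw.splitlines()
        if line.strip()
    ]
    headings, body = rows[0], rows[1:]
    return [dict(zip(headings, row)) for row in body]


step_text = """these books in the catalog
    | Author        | Title                      |
    | Stephen King  | The Shining                |
    | James Baldwin | If Beale Street Could Talk |"""

match = STEP_PATTERN.match(step_text)
books = parse_table(match.group("table"))
```

The clumsiness is visible here: every step that needs a table has to repeat the table pattern and the hand-written parsing.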

Behave
 

Advantages

Behave supports most of the Gherkin syntax; in particular it does support data tables (even though it thinks that all data tables have a title row).

Disadvantages
 

Behave doesn’t have a good parser; for example, it will be confused by this scenario title:

Scenario: The catalog can be searched by author’s name; first names are searched first,
and last names are searched last.
 
1) The second line begins with ‘and last names…’; Behave will think this is a Gherkin ‘And’ and will complain that there was no ‘Given’.
 
2) Behave doesn’t raise an error if your scenario outline has a <parameter> with no matching column title in the Examples table; your scenario simply fails.
 
3) Behave doesn’t support some of the newer features of Gherkin, like ‘Rule’.
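
Because of the second point, it may be worth checking scenario outlines yourself before running them. The following is a minimal sketch using only the standard library and is not part of Behave; the `missing_columns` helper and the sample outline are assumptions for illustration. It collects every <parameter> used in the steps and reports those with no matching column in the first row of the Examples table.

```python
import re

# Matches <parameter> placeholders in step text.
PLACEHOLDER = re.compile(r"<([^<>]+)>")


def missing_columns(outline_text):
    """Return placeholders used in the steps that have no Examples column."""
    steps, _, examples = outline_text.partition("Examples:")
    used = set(PLACEHOLDER.findall(steps))
    # The first pipe-delimited line after 'Examples:' is the heading row.
    heading = next(
        (line for line in examples.splitlines() if line.strip().startswith("|")),
        "",
    )
    columns = {cell.strip() for cell in heading.strip().strip("|").split("|")}
    return sorted(used - columns)


outline = """Scenario Outline: Search by name
    Given a catalog containing <title> by <author>
    When a name search is performed for <name>
    Then <title> is returned

    Examples:
    | author       | title       |
    | Stephen King | The Shining |
"""

print(missing_columns(outline))  # prints ['name']
```

A check like this could run before the test suite so that a misspelled parameter shows up as a clear message instead of an unexplained failing scenario.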
 
Conclusions
 
Overall, Behave is the better tool because it supports data tables – in my experience, easily more than half of the (several thousand) requirements I have written used data tables.
 
If your developers are making extensive use of Pytest mocks, you may be better off using Pytest-BDD. If you take that approach, DO NOT fall into the trap of using fixtures in place of data tables; you will lose all of the value of Specification by Example. Remember what Aslak Hellesøy, the creator of Cucumber, once wrote in a blog post: “If you think Cucumber is a testing tool, please read on, because you are wrong.”
