To me, fixtures are a code smell. If you need so much common setup to test your application, the code under test is doing too much. It's unfortunately quite common in Rails or Django projects. You need to pass the Foo model to your function, but it will look up foo.bar.baz, so you need to wire those up as well, which in turn need further models. Of course everything also talks to the database.
Instead, if you're able to decouple the ORM from your application with a separate layer and pass plain objects around (not fat db-backed models), you're much freer to write code that's "pure": this input gives that output. For tests like these you only need to create whatever data structure the function expects and then verify the output. Worst case, verify that it called some mocks with x, y, z.
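A rough sketch of the kind of test I mean (made-up names, minitest):

```
require "minitest/autorun"

# Plain value object, no database behind it.
Invoice = Struct.new(:subtotal_cents, :country, keyword_init: true)

# Pure function: input in, output out, no ORM lookups.
def total_with_vat(invoice)
  rate = invoice.country == "DE" ? 0.19 : 0.0
  (invoice.subtotal_cents * (1 + rate)).round
end

class TotalWithVatTest < Minitest::Test
  def test_adds_german_vat
    invoice = Invoice.new(subtotal_cents: 10_000, country: "DE")
    assert_equal 11_900, total_with_vat(invoice)
  end
end
```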
radanskoric 1 days ago [-]
In theory you're 100% right: a true unit test is completely isolated from the rest of the system, and then a lot of the problems disappear.
In reality, that is also not free. It imposes some restrictions on the code. Sometimes being pragmatic and backing off from the ideal leads to faster development and quicker delivery of value to the users. Rails is big on these pragmatic tradeoffs. The important thing is that we know when and why we're making the tradeoff.
Usually I go with Rails defaults and usually it's not a problem. Sometimes, when the code is especially complex and perhaps on the critical path, I turn up the purity dial and go down the road you describe, exactly for the benefits you describe.
But when I decide that sticking to the defaults is the right tradeoff, I want to get the most out of it and use Fixtures (or Factories) in the optimal way.
axelthegerman 1 days ago [-]
Right, let's add more layers, that always solves everything.
No language or abstraction is perfect, but if someone prefers pure functional coding, Rails and Django are just not it; don't try to make them into it. Others like 'em just as they are.
antonymoose 1 days ago [-]
Three clean and simple layers that dovetail are better than one ball-of-yarn God class with too many dependencies.
jstanley 1 days ago [-]
You're more likely to get a ball-of-yarn if you try to separate things into unnatural layers than if you let it be a single layer.
antonymoose 1 days ago [-]
If you have to pass data up and down, back and forth along the chain and rely on mutation, sure, yes. That's ball-of-yarn in a nutshell. There is such a thing as reductio ad absurdum in software.
Nevertheless I've found far more God classes that could be refactored into clean layers than the other way around. Specifically in the context of Rails-style web apps, as the GP is specifically discussing. Batteries included doesn't necessarily require large tangled God classes. One can just as well compose a series of layers into a strong default implementation that wraps complex behavior while allowing one to bail out and recompose with necessary overrides, for example reasonable mocks in a test context.
Of course this could then allow one to isolate and test individual units easily, and circle back with an integration test of the overall component.
onionisafruit 1 days ago [-]
Fixtures are great for integration tests. But I agree that unit tests needing fixtures indicates a design issue.
Still, most of us work on code bases with design issues either of our own making or somebody else's.
matsemann 1 hours ago [-]
Yup, so I'm not against fixtures per se, they have their uses and can be a pragmatic choice. I just often don't like when I have to use them, as it's often to patch over something else. But things are never perfect.
bluGill 1 days ago [-]
I disagree. Prefer integration tests to unit tests wherever possible. If your tests run fast (which most integration tests should be able to do) and you are running them often, there is no downside: your tests run fast, and since you run them often you always know what broke - the last thing you changed.
Fixtures done right ensure that everyone starts with a good standard setup. The question is WHAT state the fixture sets up. I have a fixture that sets up a temporary data directory with nothing in it - you can set up your own state, but everything will read from that temporary data directory.
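In Ruby/minitest terms the empty-data-directory fixture is roughly this (a sketch, not my actual code):

```
require "minitest/autorun"
require "tmpdir"
require "fileutils"

class DataDirTest < Minitest::Test
  def setup
    # Every test gets its own empty, temporary data directory.
    @data_dir = Dir.mktmpdir("test-data")
  end

  def teardown
    FileUtils.remove_entry(@data_dir)
  end

  def test_data_dir_starts_empty
    assert_empty Dir.children(@data_dir)
  end
end
```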
Unit tests do have a place, but most of us are not writing code that has a strong, well-defined interface that we can't change. As such they don't add much value, since changes to the code also imply changes to the code that uses it. When some algorithm is used in a lot of places, unit test it well - you wouldn't dare change it anyway. But when the algorithm is specific to the one place that calls it, there is no point in a separate test for it, even though you could. (There is a lot of grey area in the middle where you may write a few unit tests but trust the comprehensive integration tests.)
> Worst case verify that it called some mocks with x,y,z.
That is the worst case, to be avoided if at all possible (sometimes it isn't possible). That a function is called is an implementation detail. Nobody cares. I've seen too many tests fail because I decided to change a function signature and now there is a new parameter A that every test needs to be updated to expect. Sometimes this is your only choice, but mock-heavy tests are a smell in general and that is really what I'm against. Don't test implementation details; test what the customers care about, is my point, and everything else follows from that (and where you have a different approach that follows from that, it may be a good thing I want to know about!)
matsemann 1 hours ago [-]
I guess it depends a bit on what you work on. Lately I'm working on algorithm-heavy stuff, where testing input=>output is much more valuable than checking that things run; it's easier to consider edge cases when calling it directly, etc. But if you're making a CRUD app it's often more useful to test a flow. So it depends.
As for mocks I don't disagree, hence calling it worst case.
What often works for me is separating the code. For instance, if I call a function that first queries the db and then marshals that data into something, it's often easier to test it by splitting it: one function that queries, which can be tested with some db fixtures or other setup, and another that takes a model in, only does the pure logic, and returns the result - that one can be tested separately. And then a third function, which is the new one, that just calls the first and passes the result into the second. It can be boilerplatey, so again, it depends.
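Roughly like this, as a Rails-flavoured sketch (the models and columns are made up):

```
# 1. Thin query function: test it with a few db fixtures or created records.
def fetch_recent_orders(user)
  user.orders.where("created_at >= ?", 30.days.ago).to_a
end

# 2. Pure logic: test it directly with plain in-memory objects.
def summarize_orders(orders)
  { count: orders.size, total_cents: orders.sum(&:total_cents) }
end

# 3. Thin composer (the "new" function): barely needs a test of its own.
def recent_order_summary(user)
  summarize_orders(fetch_recent_orders(user))
end
```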
abhashanand1501 1 days ago [-]
You should look into factory_boy (in Django). Been using it for 10 years. It helps with this situation.
FooFactory() will automatically set up all the foreign key dependencies.
It can also generate fuzzy data, although fuzzy data has its own issues in terms of brittle tests (if not done correctly).
orwin 1 days ago [-]
People really use fixtures to simulate internal code? I thought they were overwhelmingly used to simulate external API responses, or weird libraries that need some context switching (and in that case, a piece of advice: the NIH syndrome is _very_ valid, and sometimes the library you use isn't worth the time you put into "fixing" it: just rewrite the damn thing).
[edit] Though in my case we have one fixture that loads a JSON representation of our dev DynamoDB into moto, and thus we mock internal data. But this data is still read through our data models; it doesn't really replace internal code, only internal "mechanics".
bluGill 1 days ago [-]
Simulating an external API is the responsibility of a test double of some sort, not a fixture. Fixtures often set up the test doubles with test data, but they are not the test double. Fixtures can set up other things as well (the line between factories and fixtures is blurry).
swader999 1 days ago [-]
I find inheritance in tests leads quickly to hell. Striving for every last bit of reuse seems like the right thing to do but it hurts in subtle ways that compound over time. If you must, use composition and spend the time on a DSL that clearly documents the setup in each test.
ozim 1 days ago [-]
Inheritance everywhere leads to hell.
While in application code it is easy to curb, for test code it is just really hard to get people to understand that all the duplication that should be there in tests is GOOD.
disgruntledphd2 1 days ago [-]
As always, there's a tradeoff. I used to go for doing all setup in each test for clarity, but one of my co-workers eventually convinced me that doing this in a fixture is better.
There'll always be some duplication, but too much makes it harder to see the important stuff in a test.
bluGill 1 days ago [-]
It depends on how much setup is done, and where it is. 10 tests that share a setup fixture are good. 100,000 starts to get unmaintainable.
I have lots of test fixtures each responsible for about 10 tests. It is very common to have 10-20 tests that share a startup configuration and then adjust it in various ways.
ozim 14 hours ago [-]
I guess one level of inheritance is bearable; the downside is that once you start, there will be people coming in later adding more.
radanskoric 1 days ago [-]
When you say inheritance do you mean DRY as in "Don't repeat yourself"?
I'm not sure what you mean by inheritance in tests but DRY is criminally overused in tests. That could be a whole separate article but the tradeoffs are very different between test and app code and repetition in the test code is much less problematic and sometimes even desirable.
swader999 24 hours ago [-]
Both actually. But having to open up three files to figure out how this thing is set up, and then having to override setup to change it slightly for my one case. You get the idea. A really good DSL can help in the areas where creating the SUT is very complex.
dkarl 1 days ago [-]
I feel like the elephant in the room in this post is property-based testing. I dislike using fixtures for all the reasons stated in the post, and when it seems like I might really need them, I reach for property-based testing instead.
"Generators" for property-based testing might be similar to what the author is calling "factories." Generators create values of a given type, sometimes with particular properties, and can be combined to create generators of other types. (The terminology varies from one library to another. Different libraries use the terms "generators," "arbitraries," and "strategies" in slightly different and overlapping ways.)
For example, if you have a generator for strings and a generator for non-negative integers, it's trivial to create a generator for a type Person(name, age).
Generators can also be filtered. For example, if you have a generator for Account instances, and you need active Account instances in your test, you can apply a filter to the base generator to select only the instances where _.isActive is true.
Once you have a base generator for each type you need in your tests, the individual tests become clear and succinct. There is a learning curve for working with generators, but as a rule, the test code is very easy to read, even if it's tricky to write at first.
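A hand-rolled sketch of the idea in Ruby (no particular library; real property-based testing libraries add shrinking, reproducible seeds and much more):

```
# "Generators" as plain lambdas, just to show the shape of the idea.
string_gen = -> { ("a".."z").to_a.sample(8).join }
age_gen    = -> { rand(0..120) }

Person = Struct.new(:name, :age)

# Combine base generators into a generator for a composite type.
person_gen = -> { Person.new(string_gen.call, age_gen.call) }

# Filter a generator: keep drawing until the predicate holds.
def filtered(gen, &pred)
  -> { loop { v = gen.call; break v if pred.call(v) } }
end

adult_gen = filtered(person_gen) { |p| p.age >= 18 }

# Every value drawn from adult_gen satisfies the filter.
100.times { raise "not an adult" unless adult_gen.call.age >= 18 }
```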
radanskoric 1 days ago [-]
Author here. Yes, what you describe sounds very much like what I call Factories (and that's what they're usually called in Ruby land, and some other languages).
The problem arises when they're used to generate database records, which is a common approach in Rails applications. Because you're generating a lot of them, you end up putting a lot more load on the test database, which slows down the whole test suite considerably.
If you use them to generate purely in memory objects, this problem goes away and then I also prefer to use factories (or generators, as you describe them).
Ah, ok, now I understand - I wasn't talking about that. From what I understand about property-based testing, it's sort of halfway between regular example-based testing and formal proofs: it tries to prove a statement, but instead of a symbolic proof it does it stochastically via a set of examples?
Unfortunately, I'm not aware of a good property based testing library in Ruby, although it would be useful to have one.
Even so I'm guessing that property based testing in practice would be too resource intensive to test the entire application with it? You'd probably only test critical domain logic components and use regular example tests for the rest.
dkarl 1 days ago [-]
Oh, that's a very different set of requirements than I was thinking, and I missed that context even though you did mention database testing at one point. You're right, property-based testing is less helpful in that situation, because your database may contain legacy data that your current application code must be able to read but also shouldn't be able to write.
strehldev 1 days ago [-]
This is basically how I solved this in a past codebase. I called them "builders", and for complex scenarios requiring multiple different entities I had "scenario builders" that created all of them.
My rule was to randomize every property by default. The test needs to specify which property needs to have a certain value. E.g. set the address if you're testing something about the address.
So it was immediately obvious which properties a test relied on.
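Something along these lines, as a simplified Ruby sketch (the fields are made up):

```
require "securerandom"

# Builder: every property randomized by default; a test overrides only
# the properties it actually cares about.
def build_customer(overrides = {})
  {
    id:      SecureRandom.uuid,
    name:    "name-#{SecureRandom.hex(4)}",
    address: "street-#{rand(1..999)}",
    active:  [true, false].sample
  }.merge(overrides)
end

# A test about addresses pins down the address and nothing else:
customer = build_customer(address: "Main Street 1")
```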
dkarl 1 days ago [-]
You should see if your language has a property-based testing library; it'll have a ton of useful functionality to help with what you're already doing!
A clarification on terminology, the "property" in "property-based testing" refers to properties that code under test is supposed to obey. For example, in the author's Example 2, the collection being sorted is the property that the test is checking.
FuckButtons 1 days ago [-]
Forgive me if I'm just reading this incorrectly, but that doesn't sound exactly like property testing as I've done it. The libraries implement an algorithm for narrowing down to the simplest reproducer for a given failure mode, so all of the inputs to a test that are randomized are provided by the library.
japhyr 1 days ago [-]
How did you deal with reproducibility when your tests use randomized data? Do you run with a random seed or something, so you can reproduce failures when they come up?
dkarl 1 days ago [-]
ScalaCheck includes the random seed in its failure messages, so it's easy to pull the seed out of CI/CD logs and reproduce the failure deterministically.
RHSeeger 1 days ago [-]
Example 1 bothers me. It says
> This test has just made it impossible to introduce another active project without breaking it, even if the scope was not actually broken. Add a new variant of an active project for an unrelated test and now you have to also update this test.
And then goes on to test that the known active projects are indeed included in what the call to Project.active returns.
However, that doesn't test that "active scope returns active projects". Rather, it tests that
- active scope returns _at least some of the_ active projects
And it does not test that
- active scope returns _all_ of the active projects
- active scope does not return non-active projects
Which, admittedly, is only different because the original statement is ambiguous. But the difference is that the test will pass if it returns non-active projects, too; which probably is not the expected behavior.
I prefer to set things up so that my test fixtures (test data) are created as close to the test as possible, and then test it in the way the article is saying is wrong (in some cases)... i.e., test that the call to Project.active returns _only_ those projects that should be active.
Another option would be to have 3 different tests that test all those things, but the second one (_all_ of the active projects) is going to fail if the test fixture changes to include more active projects.
jon-wood 1 days ago [-]
Strongly agree with that. It's slower, but I will always prefer building actual database records as either part of the test or in the test context rather than relying on some predefined fixtures. That makes the test behaviour clearer, and it means you don't have a bunch of unrelated tests failing because someone changed a fixture to accommodate a new test.
radanskoric 1 days ago [-]
Author here. Thanks for writing up your thoughts on this!
The "doesn't include non-active projects objections is easy", please check the Example 1 test again, there's a line for that:
Hm, if you missed it, perhaps I should have emphasised this part more, maybe add a blank line before it ...
Regarding the fact that the test does not check that the scope returns "all" active projects, that's a bit more complex to address, but let me tell you how I'm thinking about it:
The point of tests is to validate expected behaviours and prevent regressions (i.e. breaking old behaviour when introducing new features). It is impossible for tests to do this 100%. E.g. even if you test that the scope returns all active projects present in the fixtures, that doesn't guarantee that the scope always returns all active projects for any possible list of active projects. If you want 100% validation your only choice is to turn to formal proof methods, but that's a whole different topic.
You could always add more active project examples. When you write a test that checks that "active projects A, B and C" are returned, that is the same test as if your fixtures contained ONLY active projects A, B and C and you then tested that all of them are returned. In either case it is up to you to make sure that the projects are representative.
So by rewriting the test to check:
1. These example projects are included.
2. These other example projects are excluded.
You can write a test that is equally powerful as if you restricted your fixtures to just those example projects and then made an absolute comparison. You're not losing any testing power. Except you're making the test easier to maintain.
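Schematically, the reworked test looks something like this (fixture names other than :inactive are placeholders, not the article's exact code):

```
test "active scope returns active projects" do
  active_projects = Project.active

  # 1. Representative active projects are included...
  assert_includes active_projects, projects(:active)

  # 2. ...and a representative inactive project is excluded.
  refute_includes active_projects, projects(:inactive)
end
```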
Does that make sense? Let me know which part is still confusing and I'll try to rephrase the explanation.
RHSeeger 1 days ago [-]
I want to start by saying that I agree with what you're trying to accomplish here. And I agree with some of the ways you go about it. I'm trying to find the right words to convey what I mean here, but... the best I can come up with is... what I'm saying here isn't "you're wrong because", it's "what you're doing seems to miss some situations; here's what I do that helps for those".
> The "doesn't include non-active projects objections is easy", please check the Example 1 test again, there's a line for that:
You're correct; I totally missed that.
> In either case it is up to you to make sure that the projects are representative.
That's fair, but that's also the point you're trying to address / make more robust by how you're trying to write tests (what the article is about). Specifically
- The article is about: How to make sure your tests are robust against test fixtures changing
- That comment says: It's up to you to make sure your test fixtures don't change in a way that breaks your tests
> You can write a test that is equally powerful as if you restricted your fixtures to just those example projects and then made an absolute comparison. You're not losing any testing power. Except you're making the test easier to maintain.
By restricting your fixtures to just the projects (that are relevant to the test), you're making _the tests_ easier to maintain; not just the one test but the test harness as a whole. What I mean is that you're reducing "action at a distance". When you modify the data for your test, you don't need to worry about what other tests, somewhere else, might also be impacted.
Plus you do gain testing power, because you can test more things. For example, you can confirm it returns _every_ active project.
All that being said, what I'm talking about relies on creating the test data local to the tests. And doing that has a cost (time, generally). So there's a tradeoff there.
radanskoric 1 days ago [-]
I think I'm getting what you mean and I almost completely agree with you, let me address one part, the only part where I don't agree:
> Plus you do gain testing power, because you can test more things. For example, you can confirm it returns _every_ active project.
Imagine this:
1. You start with some fixtures. You crafted the fixtures and you're happy that the fixtures are good for the test you're about to write.
2. You write a test where you assert the EXACT collection that is returned. This is, as you say, a test that "confirms the scope returns _every_ active project".
3. You now rewrite the test so that it checks that the collection includes ALL active projects and excludes all inactive projects.
Do you agree that nothing changed when you went from 2 to 3? As long as you don't change the fixtures, those 2 versions of the test will behave exactly the same: if one passes so will the other, and if one fails so will the other. As long as fixtures don't change they have exactly the same testing power.
If you agree on that, now imagine that you added another project to the fixtures. Has the testing power of the tests changed just because fixtures have been changed?
RHSeeger 1 days ago [-]
> If you agree on that, now imagine that you added another project to the fixtures. Has the testing power of the tests changed just because fixtures have been changed?
No, _but_ (and this is a big _but_) you're not testing the contract of the method, which (presumably) is to return all and only active projects.
Testing that it returns _some_ of the active projects is useful, but there are cases where it won't point out an issue. For example, imagine:
- Over time, more tests are added "elsewhere" that use the same fixtures
- More active projects are added to the fixture to support those tests
- The implementation in the method is changed to be faster, and an off-by-one error is introduced; so the last project in the list isn't returned
In that ^ case, the test that _some_ of the active projects are returned will still pass; the bug won't be noticed.
Not directly related to the above, but I'll note that I would also split 2/3 into different tests.
- Make sure all projects returned are active
- Make sure projects returned includes all active projects
I think that's more of a style thing, but I _try_ to stick to each test testing one and only one thing. I don't always do that, but it's a rule of thumb for me.
radanskoric 1 days ago [-]
I'm with you on the one assertion per test. I bundled two assertions into the same test here because my whole point was to have them effectively together describe a single test, just in a more maintainable manner.
Regarding the fact that I'm not fully testing the contract of the method, you're absolutely correct. But also, no example based test suite is fully doing that. As long as the test suite is example based it is always possible to find a counter-case where the contract is violated but the test suite misses it.
These counter-cases will be more contrived and less likely the better the test suite. So all of us at some point decide that we've done enough and that more contrived cases are so unlikely and the cost of mistake is so small that it's not worth it to put in the extra testing effort. Some people don't explicitly think about it but that decision is still made one way or another.
This is a long way of saying that I both agree with you but that also, in most cases, I would still take the tradeoff and go for more maintainable tests.
sceptic123 1 days ago [-]
There's always a scenario where this can break though. What happens if someone introduces a test that confirms that marking `active1` as inactive works? Then it depends on the test order whether your initial test still passes.
radanskoric 1 days ago [-]
It's required for tests to clean up after themselves. With Rails and fixtures this is handled by default: each test runs inside a transaction which is rolled back at the end of the test. That way each test starts with the same initial state.
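Concretely, this is the (default) Rails test configuration, written out explicitly here:

```
# test/test_helper.rb
class ActiveSupport::TestCase
  # Wrap each test in a database transaction that is rolled back afterwards,
  # so every test starts from the same fixture state.
  self.use_transactional_tests = true

  # Load all fixtures from test/fixtures/*.yml once for the suite.
  fixtures :all
end
```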
jillesvangurp 1 days ago [-]
I've been doing one thing for many years that forces me to be smart about test fixtures in my integration tests and test data: I test concurrently with many threads.
This means that my tests can't depend on the database being in some known state, can't assume exclusive access to that database, and can't, for example, modify anything that might be used by another test. They can only modify things that are specific to that test.
Most of my tests work around this limitation by either just creating their own teams, users, and other objects they need with randomized ids, or in some cases deferring their execution until some bit of logic holding a lock has created some shared data that is then never modified.
Instead of hard-coded IDs, I tend to use randomized ids (UUIDs typically). I have a person data generator that gives me human-readable names, email addresses, etc. Randomized data like this avoids tests modifying each other's data.
As an example, we have a few tests for an analytics dashboard that lock on a bit of expensive code that creates a lot of content via our APIs to do analytics on. The scenario is quite elaborate and uses a few factories, known timestamps, etc. If I refactor my data model, my factories are also refactored. Using a lock ensures that the data is initialized only once. Once that is done, there are a bunch of tests that test different queries against it.
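The lock-guarded shared setup is roughly this pattern, sketched here in Ruby with stand-in content:

```
require "securerandom"

# Expensive shared scenario: created at most once under a lock,
# then only ever read by the tests that depend on it.
module AnalyticsScenario
  LOCK = Mutex.new

  def self.data
    LOCK.synchronize do
      # Stand-in for the real (expensive) content creation through the APIs.
      @data ||= { org_id: SecureRandom.uuid, created_at: Time.now.utc }
    end
  end
end

# Any number of concurrently running tests can call AnalyticsScenario.data.
```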
You might think that all this is slow. It's not. I have about 380 integration tests like this that run in under 30 seconds on my laptop (which has a lot of CPU cores). Having this as a safety net is very empowering. I've been on teams that had fewer tests where running them took ten or more minutes. This I can do quickly before committing.
Testing like this has many advantages. One is easy-to-maintain tests. I put some effort into usable test data factories. The "when" part of a BDD-style integration test is usually most of the work. So, by making that as easy as I can, I lower the barrier for writing more tests. And using all my CPU cores minimizes the impact new tests have on execution time to the point where I don't worry about that.
Another is that for big structural changes my tests usually continue to work if I just fix their shared factories to do the right thing.
tclancy 1 days ago [-]
I feel this is all contentious because, like so much in coding, we have people taking Thing That Works For Me on their current project or in their experience over time and declaring it to be The One True Way while other people are in completely different codebases with different priorities around shipping v quality v cost or what have you and are complaining "That doesn't work for me, Y has always been my go to".
The answer is most definitely, 100%, with no room for argument, to not speak so assuredly, acknowledge other people have the right to think differently, and find synthesis and/or a set of heuristics that apply for given cases.
But this is the Internet, and we need to be arguing PS2 vs X-Box for the rest of our lives, so have at it.
(Me? Factories are great until they aren't, which may not happen if a project or a team is small enough. Generators are great but do have some footguns and I would love to hand over everything to property-based testing, but I _feel_, without any experimenting or trying, they resist anything other than the purest of pure unit tests and can't help with integration tests that much.)
jrochkind1 1 days ago [-]
Factories are definitely the cause of my test suite being slower than I'd like; I've been thinking of switching to fixtures. Good to see some context on what challenges I might be dealing with instead if I do.
radanskoric 1 days ago [-]
Author here. I'm a big fan of factories but the slowness is a real drag on large test suites. If you're considering switching, remember that you can do it gradually; there's no law against using both fixtures and factories in the same project. In some cases (mostly on very complex domain data models) it even makes sense: fixtures for the base setup that all tests share, factories for additional test-specific records.
Btw, I also have an article with some of my learnings using factories and I make a remark on how it helps with test speed: https://radanskoric.com/articles/test-factories-principal-of...
Thanks! While I have you, since you seem to know what's up with this stuff, I'm going to ask you a question I have been curious about, in Rails land too.
While I see the pros (and cons) of fixtures, one thing I do _not_ like is Rails' ordinary way of specifying fixtures, in yaml files. It gets especially terrible for associations.
It's occurred to me there's no reason I can't use FactoryBot to create what are actually fixtures -- as they will be run once, at test boot, etc. It would not be that hard to set up a little harness code to use FactoryBot to create objects at test boot and store them (or logic for fetching them, rather) in, I dunno, $fixtures[:some_name] or what have you for referral. And that seems much preferable to me, as I consider switching to/introducing fixtures.
But I haven't seen anyone do this or mention it or suggest it. Any thoughts?
onionisafruit 1 days ago [-]
I use the pattern you describe, but not in Ruby. I use code to build fixtures through SQL inserts. The code creates a new db whose name includes a hash of the test data (actually a hash of the source files that build the fixtures).
Read-only tests only need to run the bootstrap code if their particular fixture hasn’t been created on that machine before. Same with some tests that write data but can be encapsulated in a transaction that gets rolled back at the end.
Some more complex tests need an isolated db because their changes can’t be contained in a db transaction (usually because the code under test commits a db transaction). These need to run the fixture bootstrap every time. We don’t have many of these so it’s not a big deal that they take a second or two. If we had more we would probably use separate, smaller fixtures for these.
radanskoric 1 days ago [-]
Your thinking is sound. At the end of the day, Rails' default fixtures are nothing more than some code that reads yaml files and creates records once at the start of the test suite run.
So you can definitely use FactoryBot to create them. However, the reason I think that's rarely done is that you're pretty likely to start recreating a lot of the features of Rails fixtures yourself. And perhaps all you need to do is to dynamically generate the yaml files. Rails yaml fixtures are actually ERB files, so you can treat a fixture as an ERB template and generate its contents dynamically: https://guides.rubyonrails.org/testing.html#embedding-code-i...
If that is flexible enough for you, it's a better path since you'll get all the usual fixture helpers and association resolving logic for free.
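For example, something along these lines in the fixture file itself (illustrative names):

```
# test/fixtures/projects.yml -- fixture files are run through ERB before YAML
<% 1.upto(3) do |i| %>
project_<%= i %>:
  name: Project <%= i %>
  status: active
<% end %>
```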
jrochkind1 1 days ago [-]
Cool, thanks!
I feel like I don't _want_ the association resolving logic really, that's what I don't like! And if it's live Ruby instead of YAML, it's easy to refer to another fixture object by just looking it up as a fixture like normal? (I guess there are order-of-operations issues though, hm.)
And the rest seems straightforward enough, and better to avoid that "compile to yaml" stage for debugging and such.
We'll see, maybe I'll get around to trying it at some point, and release a perversely named factory_bot_fixtures gem. :)
yxhuvud 1 days ago [-]
What I feel is really missing from factories is the ability to do bulk inserts of a whole chain of entries (including entries of different kinds). That is where 95% of the inefficiency comes from. As an additional bonus it would make it easy to just list every single record that was created for a spec.
mnutt 1 days ago [-]
I have a large rails app that was plagued with slow specs using factory_bot. Associations in factories are especially dangerous given how easy it is to build up big dependency chains. The single largest speedup was noting that nearly every test was in the context of a user and org, and creating a default_user and default_org fixture.
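The shape of it (illustrative names, not our exact setup): a fixture row for the default user, and factories that look it up instead of building a fresh one for every record:

```
# test/fixtures/users.yml would define:
#   default_user:
#     email: default@example.com
#
# Factories then reuse that row instead of creating a new user per record:
FactoryBot.define do
  factory :document do
    title { "A document" }
    user  { User.find_by!(email: "default@example.com") }
  end
end
```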
jrochkind1 1 days ago [-]
That's a great example, thanks.
Then you just refer to the fixture in your factory definitions? Seems very reasonable.
mijoharas 1 days ago [-]
There's a profiler that can show you what to focus on, probably fprof here: https://test-prof.evilmartians.io/ (been a while and I don't remember exactly what I used).
(Now maybe that's what you used to see what was causing the slowdown, but mentioning it for others to help them identify the bottlenecks.)
erdaniels 1 days ago [-]
I think fixtures generally work fine. If a change to one breaks many tests, introduce a new one and start using that. I also think it's okay to make some manual changes to them in the test and it's distinct from wanting factories; needing factories only in test code feels like a waste.
100% agree with "Test only what you want to test".
stephen 1 days ago [-]
These two suggestions are fine, but I don't think they make fixtures really that much better--they're still a morass of technical debt & should be avoided at all costs.
The article doesn't mention what I hate most about fixtures: the noise of all the other crap in the fixture that doesn't matter to the current test scenario.
I.e. I want to test "merge these two books" -- great -- but now when stepping through the code, I have 30, 40, 100 other books floating around the code/database b/c "they were added by the fixture" that I need to ignore / step through / etc. Gah.
Author here. I didn't mention it because I wasn't writing an evaluation of fixtures. Just writing about how to make better use of fixtures. I actually use both fixtures and factories depending on the project specifics and also whether it is even my decision to make. :)
For a database-driven application with sqlalchemy, I've found mixer[0] to be pretty helpful. It gives you an easy way to generate an object, and it automatically creates dummy-objects that your object depends on.
You can also supply defaults and name schemes for individual columns.
For business logic, I prefer to have it structured in a way that it doesn't need the database for testing, but loading and searching stuff from the DB also needs to be tested, and for those, mixer strikes a really good balance. You only need to specify the attributes that are relevant for the test, and you don't need shared fixtures between many tests.
I’ve found that golden master tests (aka snapshot testing) pair very well with fixtures. If I need to add to the fixtures for a new test, I regenerate the golden files for all the known good tests. I barely need to glance at these changes because, as I said, they are known good. Still I usually give them a brief once over to make sure I didn’t do something like add too many records to a response that’s supposed to be a partial page. Then I go about writing the new test and implementing the change I’m testing. After implementing the change, only the new test’s golden files should change.
They are also nice because I don’t have to think so much about assertions. They automatically assert the response is exactly the same as before.
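In Ruby terms the helper is roughly this shape (a sketch, not the actual implementation; the UPDATE_GOLDEN flag is made up):

```
require "minitest/autorun"
require "fileutils"
require "json"

class ReportTest < Minitest::Test
  # Golden-master assertion: compare output to a stored "golden" file.
  # Re-run with UPDATE_GOLDEN=1 to regenerate goldens after a known-good change.
  def assert_matches_golden(actual, path)
    if ENV["UPDATE_GOLDEN"]
      FileUtils.mkdir_p(File.dirname(path))
      File.write(path, actual)
    end
    assert_equal File.read(path), actual
  end

  def test_summary_report
    actual = JSON.pretty_generate({ total: 3, status: "ok" }) + "\n"
    assert_matches_golden(actual, "test/golden/summary_report.json")
  end
end
```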
radanskoric 1 days ago [-]
I'm familiar with snapshot testing for UI and I agree with you, they can work really well for this because they're usually quick to verify. And especially if you can build in some smart tolerance to the comparison logic, it can be really easy to maintain.
But how would you do snapshot testing for behaviour? I'm approaching the problem primarily from the backend side and there most tests are about behaviour.
onionisafruit 1 days ago [-]
I'm also primarily on the back end. Like most backenders, I spend my workdays on http endpoints that return json. When I test these the "snapshot" is a json file with a pretty-printed version of the endpoint's response body. Tests fail when the file generated isn't the same as the existing file.
radanskoric 1 days ago [-]
Ah, Ok, yes, for API endpoints it makes a lot of sense. Especially if it's a public API, you need to inspect the output anyway, to ensure that the public contract is not broken.
But, I spend very little or no time on API endpoints since I don't work on projects where the frontend is an SPA. :)
Fire-Dragon-DoL 1 days ago [-]
The title was a bit confusing, what's frozen is the fixture definition (and shouldn't be).
The data created by the fixtures shouldn't be touched, or factories are being used, like the author suggested
I missed the "frozen". Ok, I understand the text better now. I think the issue is only with the frozen part though; I understand why people would think it is necessary, but I think it should be avoided as much as possible, and fixtures, like data, should be rewritten each time the data model changes.
We have a solution. Not sure if it is elegant, but use it as an inspiration: it works.
When our project runs its tests, it will generate its database JSON representation itself (only using its models) with a file that contains fake/test data. That database representation will be loaded in the dev environment, and also in the database fixture that then runs our tests. If our tests pass and we have an issue in dev, that means our tests missed something (that happens waaaaaay more often than I like to admit) and we have to add them.
Forcing every test to use this representation also forces us to have a dev environment that contains enough items to run the tests, and we can't forget to generate an item in the dev database, since that would mean our new feature isn't tested.
radanskoric 1 days ago [-]
Author here, thanks for posting. :)
immibis 1 days ago [-]
Your test fixtures are introducing tighter coupling between the tests than the code they are testing! In this scenario (a mock DB with data that a test relies on, which is incompatible with a new test you want to add), duplicating the fixture is correct. Different tests with incompatible requirements on state should use different mock data. In this specific case, however, it's also true that one of them can be modified to make the tests compatible. If you do red-green-refactor then you can duplicate first, and then coalesce by changing the first test.
"assert_equal names, names.sort" is a wrong answer. It would accept an empty collection.
bluGill 1 days ago [-]
That depends on what state the fixture sets up and when you run them. That state becomes something you expect all tests to handle; if you change the fixture state and a test breaks, you should be fixing the production code - not the test code - to handle that change. Of course, in reality I can well believe it is a test data conflict in most cases (someone with the name "test user one" is already in the database at a different address..) and this is something you need to ensure doesn't happen.
I have a fixture that sets our database to the initial install state. This works for me because we are an embedded system where every month we ship a bunch more new systems, so code needs to handle that initial install state. If we change the initial state (which we do all the time) and a test breaks, we want to know and fix that, since customers will see that situation.
However, if you run on a server in a data center I could well believe you will never again see any specific state, and so a fixture probably isn't right. Maybe ideally every test would take a snapshot of your current production database and test against that (with whatever additional data you add for the test) - if a customer enters data that breaks a test, that is an "all hands on deck" situation to fix the code before customers hit that code path. Maybe - I don't work in this space so I'm just speculating about what you need.
immibis 41 minutes ago [-]
Different unit tests work on different state. One unit test says that if I start from a blank database, add two users, now there are two users. A different unit test says that if I start from a database with two users, and list the users, I get them.
If you rely on the add-user test running before the list-users test, you introduce bad coupling between tests. However, you could run them all in order in one test - it would be a more complex test. You could then run that on a real database and call it an integration test.
Tests at all levels of complexity are useful. You could run end-to-end tests on a production e-commerce system, from signup to payment with a real credit card and delivery of a real physical product, if you wanted. Backup power systems are tested by shutting off the power a few times a year.
But this article is clearly about unit tests. You shouldn't run unit tests against the production database and you should aim to minimize their dependencies.
Instead, if you're able to decouple the ORM from your application, with a separate layer, and instead pass plain objects around (not fat db backed models), one is much freer to write code that's "pure". This input gives that output. For tests like these one only needs to create whatever data structure the function desires, and then verify the output. Worst case verify that it called some mocks with x,y,z.
In reality, that is also not free. It imposes some restrictions on the code. Sometimes being pragmatic, backing off from the ideal leads to faster development and quicker deliver of value to the users. Rails is big on these pragmatic tradeoffs. The important thing is that we know when and why we're making the tradeoff.
Usually I go with Rails defaults and usually it's not a problem. Sometimes, when the code is especially complex and perhaps on the critical path, I turn up the purity dial and go down the road you describe exactly for the benefits you describe.
But when I decide that sticking to the defaults the is right tradeoff I want to get the most of it and use Fixtures (or Factories) in the optimal way.
No language or abstraction is perfect but if someone prefers pure functional coding, Rails and Django are just not it, don't try to make them. Others like em just as they are
Nevertheless I’ve found far more God classes that could be refactored into clean layers than the other way around. Specifically in the context of Rails style web app as GP is specially discussing. Batteries included doesn’t necessarily require large tangled God classes. One can just as well compose a series of layers into a strong default implementation that wraps complex behavior while allowing one to bail-out and recompose with necessary overrides, for example reasonable mocks in a test context.
Of course this could then allow one to isolate and test individual units easily, and circle back with an integration test of the overall component.
Still, most of us work on code bases with design issues either of our own making or somebody else’s.
Fixtures done right ensure that everyone starts with a good standard setup. The question is WHAT state the fixture setups. I have a fixture that setups a temporary data directory with nothing in it - you can setup your state, but everything will read from that temporary data directory.
Unit tests do have a place, but most of us are not writing code that has a strong well defined interface that we can't change. As such they don't add much value since changes to the code also imply changes to the code that uses them. When some algorithm is used in a lot of places unit tests it well - you wouldn't dare change it anyway, but when the algorithm is specific to the one place that calls it then there is no point in a separate test for it even though you could. (there is a lot of grey area in the middle where you may do a few unit tests but trust the comprehensive integration tests)
> Worst case verify that it called some mocks with x,y,z.
That is the worst case to avoid if at all possible (sometimes it isn't) that a function is called is an implementation details. Nobody cares. I've seen too many tests fail because I decided to change a function signature and now there is a new parameter A that every test needs to be updated to expect. Sometimes this is your only choice, but mock heavy tests are a smell in general and that is really what I'm against. Don't test implementation details, test what the customers care about is my point, and everything else follows from that (and where you have a different way that follows from that it may be a good think I want to know about!)
As for mocks I don't disagree, hence calling it worst case.
What often works for me is separating the code. For instance if I call a function that first queries the db and then marshall that data into something, it's often easier to test it by splitting it. One function that queries, that one can test with some db fixtures or other setup. And then another that gets a model in and only does the pure logic and returns the result. Can then be tested separately. And then a third function which is the new one, that just calls the first and pass the result into the second. Can be boilerplaty, so again, depends.
Foofactory() will automatically setup all the foreign key dependencies.
It can also generate fuzzy data, although having fuzzy data has its own issues in terms of brittle tests (if not done correctly).
[edit] though in my case we have one fixture that load a json representation of our dev dynamodb into moto, and thus we mock internal data, but this data is still read through our data models, it doesn't really replace internal code, only internal "mechanics"
As much in applications code it is easy to curb, for test code it is just really hard to get people to understand all this duplication that should be there in tests is GOOD.
There'll always be some duplication, but too much makes it harder to see the important stuff in a test.
I have lots of test fixtures each responsible for about 10 tests. It is very common to have 10-20 tests that share a startup configuration and then adjust it in various ways.
I'm not sure what you mean by inheritance in tests but DRY is criminally overused in tests. That could be a whole separate article but the tradeoffs are very different between test and app code and repetition in the test code is much less problematic and sometimes even desirable.
"Generators" for property-based testing might be similar to what the author is calling "factories." Generators create values of a given type, sometimes with particular properties, and can be combined to create generators of other types. (The terminology varies from one library to another. Different libraries use the terms "generators," "arbitraries," and "strategies" in slightly different and overlapping ways.)
For example, if you have a generator for strings and a generator for non-negative integers, it's trivial to create a generator for a type Person(name, age).
Generators can also be filtered. For example, if you have a generator for Account instances, and you need active Account instances in your test, you can apply a filter to the base generator to select only the instances where _.isActive is true.
Once you have a base generator for each type you need in your tests, the individual tests become clear and succinct. There is a learning curve for working with generators, but as a rule, the test code is very easy to read, even if it's tricky to write at first.
The problem arises when they're used to generate Database records, which is a common approach in Rails applications. Because you're generating a lot of them you end up putting a lot more load on the test database which slows down the whole test suite considerably.
If you use them to generate purely in memory objects, this problem goes away and then I also prefer to use factories (or generators, as you describe them).
Unfortunately, I'm not aware of a good property based testing library in Ruby, although it would be useful to have one.
Even so I'm guessing that property based testing in practice would be too resource intensive to test the entire application with it? You'd probably only test critical domain logic components and use regular example tests for the rest.
My rule was to randomize every property by default. The test needs to specify which property needs to have a certain value. E.g. set the address if you're testing something about the address.
So it was immediately obvious which properties a test relied on.
A clarification on terminology, the "property" in "property-based testing" refers to properties that code under test is supposed to obey. For example, in the author's Example 2, the collection being sorted is the property that the test is checking.
> This test has just made it impossible to introduce another active project without breaking it, even if the scope was not actually broken. Add a new variant of an active project for an unrelated test and now you have to also update this test.
And then goes on to test that the known active projects are indeed included in what the call to Project.active returns.
However, that doesn't test that "active scope returns active projects". Rather, it tests that
- active scope returns _at least some of the_ active projects, and
And it does not test that
- active scope returns _all_ of the active projects
- active scope does not return non-active projects
Which, admittedly, is only different because the original statement is ambiguous. But the difference is that the test will pass if it returns non-active projects, too; which probably is not the expected behavior.
I prefer to set things up so that my test fixtures (test data) are created as close to the test as possible, and then test it in the way the article is saying is wrong (in some cases)... ie, test that the call to Project.active returns _only_ those projects that should be active.
Another option would be to have 3 different tests that test all those things, but the second one (_all_ of the active projects) is going to fail if the text fixture changes to include more active projects.
The "doesn't include non-active projects objections is easy", please check the Example 1 test again, there's a line for that:
``` refute_includes active_projects, projects(:inactive) ```
Hm, if you missed it, perhaps I should have emphasised this part more, maybe add a blank line before it ...
Regarding the fact that the test does not check that the scope returns "all" active projects, that's a bit more complex to address but let me let tell you how I'm thinking about it:
The point of tests is to validate expected behaviours and prevent regressions (i.e. breaking old behaviour when introducing new features). It is impossible for tests to do this 100%. E.g. even if you test that the scope returns all active projects present in the fixtures that doesn't guarantee that the scope always returns all active projects for any possible list of active projects. If you want 100% validation your only choice is to turn to formal proof methods but that's whole different topic.
You could always add more active project examples. When you write a test that is checking that "Active projects A,B and C" are returned that is the same test as if your fixtures contained ONLY active projects A,B and C and then you tested that all of them are returned. In either case it is up to you to make sure that the projects are representative.
So by rewriting the test to check: 1. These example projects are included. 2. These other example projects are excluded.
You can write a test that is equally powerful as if you restricted your fixtures just to those example projects and then made an absolute comparison. You're not loosing any testing power. Expect you're making the test easier to maintain.
Does that make sense? Let me know which part is still confusing and I'll try to rephrase the explanation.
> The "doesn't include non-active projects objections is easy", please check the Example 1 test again, there's a line for that:
You're correct; I totally missed that.
> In either case it is up to you to make sure that the projects are representative.
That's fair, but that's also the point you're trying to address / make more robust by how you're trying to write tests (what the article is about). Specifically
- The article is about: How to make sure you're tests are robust against test fixtures changing
- That comment says: It's up to you to make sure your test fixtures don't change in a way that breaks your tests
> You can write a test that is equally powerful as if you restricted your fixtures just to those example projects and then made an absolute comparison. You're not loosing any testing power. Expect you're making the test easier to maintain.
By restricting your fixtures to just the projects (that are relevant to the test), you're making _the tests_ easier to maintain; not just the one test but the test harness as a whole. What I mean is that you're reducing "action at a distance". When you modify the data for your test, you don't need to worry about what other tests, somewhere else, might also be impacted.
Plus you do gain testing power, because you can test more things. For example, you can confirm it returns _every_ active project.
All that being said, what I'm talking about relies on creating the test data local to the tests. And doing that has a cost (time, generally). So there's a tradeoff there.
> Plus you do gain testing power, because you can test more things. For example, you can confirm it returns _every_ active project.
Imagine this:
1. You start with some fixtures. You crafted the fixtures and you're happy that the fixtures are good for the test you're about to write.
2. You write a test where you assert the EXACT collection that is returned. This is, as you say, a test that "confirms the scope returns _every_ active project".
3. You now rewrite the test so that it checks that the collection includes ALL active projects and excludes all inactive projects.
Do you agree that nothing changed when you went from 2 to 3? As long as you don't change the fixtures, those 2 version of the test will behave exactly the same: if one passes so will the other and if one fails so will the other. As long as fixtures don't change they have exactly the same testing power.
If you agree on that, now imagine that you added another project to the fixtures. Has the testing power of the tests changed just because fixtures have been changed?
No, _but_ (and this is a big _but_) you're not testing the contract of the method, which (presumably) is to return all and only active projects.
Testing that it returns _some_ of the active methods is useful, but there are cases where it won't point out an issue. For example, image
- Over time, more tests are added "elsewhere" that use the same fixtures
- More active projects are added to the fixture to support those tests
- The implementation in the method is changed to be faster, and an off-by-one error is introduced; so the last project in the list isn't returned
In that ^ case, testing that _some_ of the active projects are returned will still return true; the bug won't be noticed.
Not directly related to the above, but I'll note that I would also split 2/3 into different tests.
- Make sure all projects returned are active
- Make sure projects returned includes all active projects
I think that's more of a style thing, but I _try_ to stick to each test testing one and only one thing. I don't always do that, but it's a rule of thumb for me.
Regarding the fact that I'm not fully testing the contract of the method, you're absolutely correct. But also, no example based test suite is fully doing that. As long as the test suite is example based it is always possible to find a counter-case where the contract is violated but the test suite misses it.
These counter-cases will be more contrived and less likely the better the test suite. So all of us at some point decide that we've done enough and that more contrived cases are so unlikely and the cost of mistake is so small that it's not worth it to put in the extra testing effort. Some people don't explicitly think about it but that decision is still made one way or another.
This is a long way of saying that I both agree with you but that also, in most cases, I would still take the tradeoff and go for more maintainable tests.
This means that my test can't depend on the database to be in some known state or assume to have exclusive access to that database. And for example modify anything that might be used by another test. They can only modify things that are specific to that test.
Most of my tests work around this limitation by either just creating their own teams, users, and other objects they need with randomized ids; or in some cases deferring their execution until some bit of logic with lock has created some shared data that then is never modified.
Instead of hard coded IDs, I tend to use randomized ids (UUIDs typically). I have a person data generator that gives me human readable names, email addresses, etc. Randomized data like this avoids test modifying each other's data.
As an example, we have a few tests for an analytics dashboard that locks on a bit of expensive code that creates a lot of content via our APIs to do analytics on. The scenario is quite elaborate and uses a few factories, known timestamps, etc. If I refactor my data model, my factories are also refactored. Using a lock ensures that data is initialized only once. Once that is done, there are a bunch of test that that test different queries against that.
You might think that all this is slow. It's not. I have about 380 integration tests like this that run in under 30 seconds on my laptop (which has a lot of CPU cores). Having this as a safety net is very empowering. I've been on teams that had fewer tests where running them took ten or more minutes. This I can do quickly before committing.
Testing like this has many advantages, but one of them is easy-to-maintain tests. I put some effort into usable test data factories. The "when" part of a BDD-style integration test is usually most of the work, so by making that as easy as I can, I lower the barrier for writing more tests. And using all my CPU cores minimizes the impact new tests have on execution time, to the point where I don't worry about it.
Another is that for big structural changes my tests usually continue to work if I just fix their shared factories to do the right thing.
The answer is most definitely, 100%, with no room for argument: to not speak so assuredly, to acknowledge other people have the right to think differently, and to find synthesis and/or a set of heuristics that apply for given cases.
But this is the Internet, and we need to be arguing PS2 vs X-Box for the rest of our lives, so have at it.
(Me? Factories are great until they aren't, which may not happen if a project or a team is small enough. Generators are great but do have some footguns. I would love to hand over everything to property-based testing, but I _feel_, without having experimented or tried it, that it resists anything other than the purest of pure unit tests and can't help that much with integration tests.)
Btw, I also have an article with some of my learnings using factories and I make a remark on how it helps with test speed: https://radanskoric.com/articles/test-factories-principal-of...
While I see the pros (and cons) of fixtures, one thing I do _not_ like is Rails' ordinary way of specifying fixtures, in YAML files. It gets especially terrible for associations.
It's occurred to me there's no reason I can't use FactoryBot to create what are actually fixtures -- as they will be run once, at test boot, etc. It would not be that hard to set up a little harness code to use FactoryBot to create objects at test boot and store them (or logic for fetching them, rather) in, I dunno, $fixtures[:some_name] or what have you, for referral. And that seems much preferable to me as I consider switching to/introducing fixtures.
But I haven't seen anyone do this or mention it or suggest it. Any thoughts?
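To make the idea concrete, a minimal sketch of such a harness (all names hypothetical):

    # In test_helper.rb, after the app and test database are loaded.
    $fixtures = {}

    def build_factory_fixtures
      $fixtures[:acme_team]  = FactoryBot.create(:team, name: "Acme")
      $fixtures[:acme_owner] = FactoryBot.create(:user, team: $fixtures[:acme_team])
    end

    build_factory_fixtures # runs once at test boot, like fixtures would

    # In a test:
    #   assert_equal $fixtures[:acme_team], $fixtures[:acme_owner].team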
Read-only tests only need to run the bootstrap code if their particular fixture hasn’t been created on that machine before. Same with some tests that write data but can be encapsulated in a transaction that gets rolled back at the end.
Some more complex tests need an isolated db because their changes can’t be contained in a db transaction (usually because the code under test commits a db transaction). These need to run the fixture bootstrap every time. We don’t have many of these so it’s not a big deal that they take a second or two. If we had more we would probably use separate, smaller fixtures for these.
So you can definitely use FactoryBot to create them. However, the reason I think that's rarely done is that you're pretty likely to start recreating a lot of the features of Rails fixtures yourself. And perhaps all you need to do is dynamically generate the YAML files. Rails YAML fixtures are actually ERB files, so you can treat one as an ERB template and generate its contents dynamically: https://guides.rubyonrails.org/testing.html#embedding-code-i...
If that is flexible enough for you, it's a better path since you'll get all the usual fixture helpers and association resolving logic for free.
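For example, something like this (a sketch; the model and attribute names are made up) generates several fixture rows from one ERB loop:

    # test/fixtures/projects.yml
    <% 1.upto(3) do |i| %>
    active_project_<%= i %>:
      name: Active project <%= i %>
      active: true
    <% end %>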
I feel like I don't _want_ the association-resolving logic really, that's what I don't like! And if it's live Ruby instead of YAML, it's easy to refer to another fixture object by just looking it up as a fixture like normal? (I guess there are order-of-operations issues though, hm.)
And the rest seems straightforward enough, and it seems better to avoid that "compile to yaml" stage for debugging and such.
We'll see, maybe I'll get around to trying it at some point, and release a perversely named factory_bot_fixtures gem. :)
Then you just refer to the fixture in your factory definitions? Seems very reasonable.
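Something like this could work, I think -- a sketch assuming a projects fixture row with a known name; the block keeps the lookup lazy, so the fixture data already exists when the factory runs:

    FactoryBot.define do
      factory :task do
        title { "Example task" }
        # Hypothetical: default the association to a record the fixtures create.
        project { Project.find_by!(name: "Active project 1") }
      end
    end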
(Now, maybe that's what you used to see what was causing the slowdown, but I'm mentioning it for others to help them identify the bottlenecks.)
100% agree with "Test only what you want to test".
The article doesn't mention what I hate most about fixtures: the noise of all the other crap in the fixture that doesn't matter to the current test scenario.
I.e. I want to test "merge these two books" -- great -- but now when stepping through the code, I have 30, 40, 100 other books floating around the code/database b/c "they were added by the fixture" that I need to ignore / step through / etc. Gah.
Factories are the way: https://joist-orm.io/testing/test-factories/
Personally, I even slightly prefer to use Factories and I also previously wrote about a better way to use them: https://radanskoric.com/articles/test-factories-principal-of...
You can also supply defaults and name schemes for individual columns.
For business logic, I prefer to have it structured in a way that it doesn't need the database for testing, but loading and searching stuff from the DB also needs to be tested, and for those cases mixer [0] strikes a really good balance. You only need to specify the attributes that are relevant for the test, and you don't need shared fixtures between many tests.
[0]: https://pypi.org/project/mixer/
They are also nice because I don’t have to think so much about assertions. They automatically assert the response is exactly the same as before.
But how would you do snapshot testing for behaviour? I'm approaching the problem primarily from the backend side and there most tests are about behaviour.
But, I spend very little or no time on API endpoints since I don't work on projects where the frontend is an SPA. :)
The data created by the fixtures shouldn't be touched, or factories should be used, like the author suggested.
We have a solution. Not sure if it is elegant, but use it as an inspiration: it works.
When our project runs its tests, it generates its database's JSON representation itself (using only its models) from a file that contains fake/test data. That database representation is loaded into the dev environment, and also into the database fixture that then runs our tests. If our tests pass and we have an issue in dev, that means our tests missed something (which happens waaaaaay more often than I like to admit) and we have to add them.
Forcing every test to use this representation also forces us to have a dev environment that contains enough items to run the tests, and we can't forget to generate an item in the dev database, since that would mean our new feature isn't tested.
"assert_equal names, names.sort" is a wrong answer. It would accept an empty collection.
I have a fixture that sets our database to the initial install state. This works for me because we are an embedded system where every month we ship a bunch more new systems, so code needs to see that initial install state. If we change the initial state (which we do all the time) and a test breaks, we want to know and fix that, since customers will see that situation.
However, if you run on a server in a data center I could well believe you will never again see any specific state, and so a fixture probably isn't right. Maybe ideally every test would take a snapshot of your current production database and test against that (with whatever additional data you add for the test) - if a customer enters data that breaks a test, that is an "all hands on deck" to fix the code before customers hit that code path. Maybe - I don't work in this space, so I'm just speculating about what you need.
If you rely on the add-user test running before the list-users test, you introduce bad coupling between tests. However, you could run them all in order in one test - it would be a more complex test. You could then run that on a real database and call it an integration test.
Tests at all levels of complexity are useful. You could run end-to-end tests on a production e-commerce system, from signup to payment with a real credit card and delivery of a real physical product, if you wanted. Backup power systems are tested by shutting off the power a few times a year.
But this article is clearly about unit tests. You shouldn't run unit tests against the production database and you should aim to minimize their dependencies.