Joins and subqueries
To join one table to another, so load also related models you can use following methods.
select_related(related: Union[List, str]) -> QuerySet
select_all(follow: bool = True) -> QuerySet
-
prefetch_related(related: Union[List, str]) -> QuerySet
-
Model
Model.load()
method
-
QuerysetProxy
QuerysetProxy.select_related(related: Union[List, str])
methodQuerysetProxy.select_all(follow: bool=True)
methodQuerysetProxy.prefetch_related(related: Union[List, str])
method
select_related
select_related(related: Union[List, str]) -> QuerySet
Allows to prefetch related models during the same query.
With select_related
always only one query is run against the database, meaning
that one (sometimes complicated) join is generated and later nested models are processed in
python.
To fetch related model use ForeignKey
names.
To chain related Models
relation use double underscores between names.
Note
If you are coming from django
note that ormar
select_related
differs ->
in django
you can select_related
only single relation types, while in ormar
you can select related across ForeignKey
relation, reverse side of ForeignKey
(so virtual auto generated keys) and ManyToMany
fields (so all relations as of current version).
Tip
To control which model fields to select use fields()
and exclude_fields()
QuerySet
methods.
Tip
To control order of models (both main or nested) use order_by()
method.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
|
1 2 3 4 5 6 7 |
|
You can provide a string or a list of strings (or a field/ list of fields)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
|
1 2 3 4 5 6 7 8 9 10 |
|
Exactly the same behavior is for Many2Many fields, where you put the names of Many2Many
fields and the final Models
are fetched for you.
Warning
If you set ForeignKey
field as not nullable (so required) during all
queries the not nullable Models
will be auto prefetched, even if you do not include
them in select_related.
Note
All methods that do not return the rows explicitly returns a QuerySet instance so you can chain them together
So operations like filter()
, select_related()
, limit()
and offset()
etc. can be chained.
Something like Track.object.select_related("album").filter(album__name="Malibu").offset(1).limit(1).all()
select_all
select_all(follow: bool = False) -> QuerySet
By default when you select all()
none of the relations are loaded, likewise,
when select_related()
is used you need to explicitly specify all relations that should
be loaded. If you want to include also nested relations this can be cumberstone.
That's why select_all()
was introduced, so by default load all relations of a model
(so kind of opposite as with all()
approach).
By default adds only directly related models of a parent model (from which the query is run).
If follow=True
is set it adds also related models of related models.
Info
To not get stuck in an infinite loop as related models also keep a relation to parent model visited models set is kept.
That way already visited models that are nested are loaded, but the load do not follow them inside. So Model A -> Model B -> Model C -> Model A -> Model X will load second Model A but will never follow into Model X. Nested relations of those kind need to be loaded manually.
With sample date like follow:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
|
To select all Companies
with all Branches
and Addresses
you can simply query:
1 2 3 4 |
|
Of course in this case it's quite easy to issue explicit relation names in select_related
,
but the benefit of select_all()
shows when you have multiple relations.
If for example Company
would have 3 relations and all of those 3 relations have it's own
3 relations you would have to issue 9 relation strings to select_related
, select_all()
is also resistant to change in names of relations.
Note
Note that you can chain select_all()
with other QuerySet
methods like filter
, exclude_fields
etc.
To exclude relations use exclude_fields()
call with names of relations (also nested) to exclude.
prefetch_related
prefetch_related(related: Union[List, str]) -> QuerySet
Allows to prefetch related models during query - but opposite to select_related
each
subsequent model is fetched in a separate database query.
With prefetch_related
always one query per Model is run against the database,
meaning that you will have multiple queries executed one after another.
To fetch related model use ForeignKey
names.
To chain related Models
relation use double underscores between names.
Tip
To control which model fields to select use fields()
and exclude_fields()
QuerySet
methods.
Tip
To control order of models (both main or nested) use order_by()
method.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
|
1 2 3 4 5 6 7 8 |
|
You can provide a string, or a list of strings
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
|
1 2 3 4 5 6 7 8 9 |
|
Exactly the same behavior is for Many2Many fields, where you put the names of Many2Many
fields and the final Models
are fetched for you.
Warning
If you set ForeignKey
field as not nullable (so required) during all
queries the not nullable Models
will be auto prefetched, even if you do not include
them in select_related.
Note
All methods that do not return the rows explicitly returns a QuerySet instance so you can chain them together
So operations like filter()
, select_related()
, limit()
and offset()
etc. can be chained.
Something like Track.object.select_related("album").filter(album__name="Malibu").offset(1).limit(1).all()
select_related vs prefetch_related
Which should you use -> select_related
or prefetch_related
?
Well, it really depends on your data. The best answer is try yourself and see which one performs faster/better in your system constraints.
What to keep in mind:
Performance
Number of queries:
select_related
always executes one query against the database,
while prefetch_related
executes multiple queries. Usually the query (I/O) operation is
the slowest one but it does not have to be.
Number of rows: Imagine that you have 10 000 object in one table A and each of those objects have 3 children in table B, and subsequently each object in table B has 2 children in table C. Something like this:
1 2 3 4 5 6 7 8 9 10 11 |
|
That means that select_related
will always return 60 000 rows (10 000 * 3 * 2) later
compacted to 10 000 models.
How many rows will return prefetch_related
?
Well, that depends, if each of models B and C is unique it will return 10 000 rows in first query, 30 000 rows (each of 3 children of A in table B are unique) in second query and 60 000 rows (each of 2 children of model B in table C are unique) in 3rd query.
In this case select_related
seems like a better choice, not only it will run one query
comparing to 3 of
prefetch_related
but will also return 60 000 rows comparing to 100 000
of prefetch_related
(10+30+60k).
But what if each Model A has exactly the same 3 models B and each models C has exactly
same models C? select_related
will still return 60 000 rows, while prefetch_related
will return 10 000 for model A,
3 rows for model B and 2 rows for Model C. So in total 10 006 rows. Now depending on the
structure of models (i.e. if it has long Text() fields etc.) prefetch_related
might be faster despite it needs to perform three separate queries instead of one.
Memory
ormar
is does not keep a registry of already loaded models.
That means that in select_related
example above you will always have 10 000 Models A,
30 000 Models B
(even if the unique number of rows in db is 3 - processing of select_related
spawns
new child models for each parent model). And 60 000 Models C.
If the same Model B is shared by rows 1, 10, 100 etc. and you update one of those, the
rest of rows that share the same child will not be updated on the spot. If you
persist your changes into the database the change will be available only after reload
(either each child separately or the whole query again). That means
that select_related
will use more memory as each child is instantiated as a new object
- obviously using it's own space.
Note
This might change in future versions if we decide to introduce caching.
Warning
By default all children (or event the same models loaded 2+ times) are completely independent, distinct python objects, despite that they represent the same row in db.
They will evaluate to True when compared, so in example above:
1 2 3 4 5 6 7 8 |
|
but
1 2 3 4 5 |
|
On the contrary - with prefetch_related
each unique distinct child model is
instantiated only once and the same child models is shared across all parent models.
That means that in prefetch_related
example above if there are 3 distinct models in
table B and 2 in table C, there will be only 5 children nested models shared between all
model A instances. That also means that if you update any attribute it will be updated
on all parents as they share the same child object.
Model methods
Each model instance have a set of methods to save
, update
or load
itself.
load
You can load the ForeignKey
related model by calling load()
method.
load()
can be used to refresh the model from the database (if it was changed by some other process).
Tip
Read more about load()
method in models methods
QuerysetProxy methods
When access directly the related ManyToMany
field as well as ReverseForeignKey
returns the list of related models.
But at the same time it exposes subset of QuerySet API, so you can filter, create, select related etc related models directly from parent model.
select_related
Works exactly the same as select_related function above but allows you to fetch related objects from other side of the relation.
Tip
To read more about QuerysetProxy
visit querysetproxy section
select_all
Works exactly the same as select_all function above but allows you to fetch related objects from other side of the relation.
Tip
To read more about QuerysetProxy
visit querysetproxy section
prefetch_related
Works exactly the same as prefetch_related function above but allows you to fetch related objects from other side of the relation.
Tip
To read more about QuerysetProxy
visit querysetproxy section