How to improve data access performance in EF Core
How to improve data access performance in EF Core
Entity Framework Core (EF Core) is an open source ORM (object-relational mapping) framework that bridges the gap between the object model of your application and the data model of your database. EF Core makes life simpler by allowing you to work with the database using .NET objects, instead of having to write data access code.
In other words, EF Core lets you write code to execute CRUD actions (create, read, update, and delete) without understanding how the data is persisted in the underlying database. You can more easily retrieve entities from the data store, add, change, and delete entities, and traverse entity graphs by working directly in C#.
You can improve data access performance in EF Core in many different ways, ranging from using eager loading to reducing the database round trips required by your queries. In this article, we will explore 10 tips and tricks or strategies we can use in EF Core to improve the data access performance of our .NET Core applications.
To work with the code examples provided below, you should have Visual Studio 2022 installed in your system. If you don’t already have a copy, you can download Visual Studio 2022 here.
Create a console application project in Visual Studio
First off, let’s create a .NET Core console application project in Visual Studio. Assuming Visual Studio 2022 is installed in your system, follow the steps outlined below to create a new .NET Core console application project.
- Launch the Visual Studio IDE.
- Click on “Create new project.”
- In the “Create new project” window, select “Console App (.NET Core)” from the list of templates displayed.
- Click Next.
- In the “Configure your new project” window, specify the name and location for the new project.
- Click Next.
- In the “Additional information” window shown next, choose “.NET 7.0 (Standard Term Support)” as the Framework version you want to use.
- Click Create.
We’ll use this project to work with EF Core 7 throughout this article. In the sections that follow, we’ll discuss 10 ways we can improve data access speed in EF Core, illustrated by code examples wherever appropriate. Let’s get started!
Retrieve only the data you need
When dealing with massive volumes of data, you should strive to retrieve only the required records for the specific query. When fetching data, you should use projections to pick just the required fields and avoid retrieving unnecessary fields.
The following code snippet shows how to obtain data in a paged fashion. Notice how the beginning page index and page size have been used to choose just the required data.
int pageSize = 50, startingPageIndex = 1;
var dataContext = new OrderProcessingDbContext();
var data = dataContext.Orders.Take(pageSize)
.Skip(startingPageIndex * pageSize)
.ToList();
Split your large data context into many smaller data contexts
The data context in your application represents your database. Hence, you may wonder whether the application should have only one or more data contexts. In Entity Framework Core, the startup time of a large data context represents a significant performance constraint. As a result, instead of using a single vast data context, you should break the data context into numerous smaller data contexts.
Ideally, you should only have one data context per module or unit of work. To use multiple data contexts, simply create a new class for each data context and extend it from the DbContext class.
Use batch updates for large numbers of entities
The default behavior of EF Core is to send individual update statements to the database when there is a batch of update statements to be executed. Naturally, multiple hits to the database entail a significant performance overhead. To change this behavior and optimize batch updates, you can take advantage of the UpdateRange() method as shown in the code snippet given below.
public class DataContext : DbContext { public void BatchUpdateAuthors(List<Author> authors) { var students = this.Authors.Where(a => a.Id >10).ToList(); this.UpdateRange(authors); SaveChanges(); } protected override void OnConfiguring (DbContextOptionsBuilder options) { options.UseInMemoryDatabase("AuthorDb"); } public DbSet<Author> Authors { get; set; } public DbSet<Book> Books { get; set; } }
If you’re using EF Core 7 or later, you can use the ExecuteUpdate and ExecuteDelete methods to perform batch updates and eliminate multiple database hits. For example:
_context.Authors.Where(a => a.Id > 10).ExecuteUpdate();
Disable change tracking for read-only queries
The default behavior of EF Core is to track objects retrieved from the database. Tracking is required when you want to update an entity with new data, but it is a costly operation when you’re dealing with large data sets. Hence, you can improve performance by disabling tracking when you won’t be modifying the entities.
For read-only queries, i.e., when you want to retrieve entities without modifying them, you should use AsNoTracking to improve performance. The following code snippet illustrates how AsNoTracking can be used to disable tracking for an individual query in EF Core.
var dbModel = await this._context.Authors.AsNoTracking() .FirstOrDefaultAsync(e => e.Id == author.Id);
The code snippet given below shows how you can retrieve entities directly from the database for read-only purposes, without tracking and without loading them into the memory.
public class DataContext : DbContext { public IQueryable<Author> GetAuthors() { return Set<Author>().AsNoTracking(); } }
Use DbContext pooling
An application typically has multiple data contexts. Because DbContext objects may be costly to create and dispose of, EF Core offers a mechanism for pooling them. By pooling, DbContext objects are created once, then reused when needed.
Using a DbContext pool in EF Core can improve performance by reducing the overhead involved in building and disposing of DbContext objects. Your application may also use less memory as a result.
The following code snippet illustrates how you can configure DbContext pooling in the Program.cs file.
builder.Services.AddDbContextPool<MyDbContext>(options => options.UseSqlServer(connection));
Use IQueryable instead of IEnumerable
When you’re quering data in EF Core, use IQueryable instead of IEnumerable. When you use IQueryable, the SQL statements will be executed on the server side, where the data is stored, whereas IEnumerable requires the query to be executed on the client side. Moreover, while IQueryable supports query optimizations and lazy loading, IEnumerable does not. This explains why IQueryable executes queries faster than IEnumerable.
The following code snippet shows how you can use IQueryable to query data.
IQueryable<Author> query = _context.Authors; query = query.Where(e => e.Id == 5); query = query.OrderBy(e => e.Id); List<Author> entities = query.ToList();
Use eager loading instead of lazy loading
EF Core uses lazy loading by default. With lazy loading, the related entities are loaded into the memory only when they are accessed. The benefit is that data aren’t loaded unless they are needed. However, lazy loading can be costly in terms of performance because multiple database queries may be required to load the data.
To solve this problem for specific scenarios, you can use eager loading in EF Core. Eager loading fetches your entities and related entities in a single query, reducing the number of round trips to the database. The following code snippet shows how eager loading can be used.
public class DataContext : DbContext { public List<Author> GetEntitiesWithEagerLoading() { List<Author> entities = this.Set<Author>() .Include(e => e.Books) .ToList(); return entities; } }
Disable lazy loading
By eliminating the need to load unnecessary related entities (as in explicit loading), lazy loading seems to relieve the developer from dealing with related entities entirely. Because EF Core is adept at automatically loading related entities from the database when accessed by your code, lazy loading seems like a nice feature.
However, lazy loading is especially prone to generating unnecessary additional round trips, which could slow down your application. You can turn off lazy loading by specifying the following in your data context:
ChangeTracker.LazyLoadingEnabled = false;
Use asynchronous instead of synchronous code
You should use async code to improve the performance and responsiveness of your application. Below I’ll share a code example that shows how you can execute queries asynchronously in EF Core. First, consider the following two model classes.
public class Author { public int Id { get; set; } public string FirstName { get; set; } public string LastName { get; set; } public List<Book> Books { get; set; } } public class Book { public int Id { get; set; } public string Title { get; set; } public Author Author { get; set; } }
In the code snippet that follows, we’ll create a custom data context class by extending the DbContext class of EF Core library.
public class DataContext : DbContext { protected readonly IConfiguration Configuration; public DataContext(IConfiguration configuration) { Configuration = configuration; } protected override void OnConfiguring (DbContextOptionsBuilder options) { options.UseInMemoryDatabase("AuthorDb"); } public DbSet<Author> Authors { get; set; } public DbSet<Book> Books { get; set; } }
Note that we’re using an in-memory database here for simplicity. The following code snippet illustrates how you can use async code to update an entity in the database using EF Core.
public async Task<int> Update(Author author) { var dbModel = await this._context.Authors .FirstOrDefaultAsync(e => e.Id == author.Id); dbModel.Id = author.Id; dbModel.FirstName = author.FirstName; dbModel.LastName = author.LastName; dbModel.Books = author.Books; return await this._context.SaveChangesAsync(); }
Reduce the round trips to the database
You can significantly reduce the number of round trips to the database by avoiding the N+1 selects problem. The N+1 selects problem has plagued database performance since the early days of ORMs. The name refers to the problem of sending N+1 small queries to the database to retrieve data that could be retrieved with one big query.
In EF Core, the N+1 problem can occur when you’re trying to load data from two tables having a one-to-many or many-to-many relationship. For example, let’s say you’re loading author data from the Authors table and also book data from the Books table. Consider the following piece of code.
foreach (var author in this._context.Authors) { author.Books.ForEach(b => b.Title.ToUpper()); }
Note that the outer foreach loop will fetch all authors using one query. This is the “1” in your N+1 queries. The inner foreach that fetches the books represents the “N” in your N+1 problem, because the inner foreach will be executed N times.
To solve this problem, you should fetch the related data in advance (using eager loading) as part of the “1” query. In other words, you should include the book data in your initial query for the author data, as shown in the code snippet given below.
var entitiesQuery = this._context.Authors .Include(b => b.Books); foreach (var entity in entitiesQuery) { entity.Books.ForEach(b => b.Title.ToUpper()); }
By doing so, you reduce the number of round trips to the database from N+1 to just one. This is because by using Include, we enable eager loading. The outer query, i.e., the entitiesQuery, executes just once to load all the author records together with the related book data. Instead of making round trips to the database, the two foreach loops work on the available data in the memory.
Incidentally, EF Core 7 reduces some round trips to the database for free. How so? The transaction management for single insert statements was dropped from EF Core 7 because it is no longer necessary. As a result, EF Core 7 omits two round trips that were used in previous versions of EF Core to begin and commit a transaction. The upshot is that EF Core 7 provides a significant performance gain when inserting data into a database using a single insert statement compared to predecessors.
Performance should be a feature
In this article we examined 10 key strategies you can use to improve data access performance in EF Core. Additionally, you should fine-tune your database design, indexes, queries, and stored procedures to get maximum benefits. Performance should be a feature of your application. It is imperative that you keep performance in mind from the outset whenever you are building applications that use a lot of data.
Finally, every application has different data access requirements and characteristics. You should benchmark your EF Core performance before and after you apply any of the changes we discussed here to assess the results for your specific application. An excellent tool for the task is BenchmarkDotNet, which you can read about in my previous post here.