Did you know you can query Excel (.xlsx) files using EF Core? Let me show you how.

Start by installing the EntityFrameworkCore.Jet package.

dotnet add package EntityFrameworkCore.Jet

This is an EF Core provider built on top of the Jet/ACE database engine which enables connecting to Access and Excel files.

There are two different drivers for connecting to the engine–OleDB and ODBC. For this example, we’ll use OleDB. To do that, we also need to install the System.Data.OleDb package

dotnet add package System.Data.OleDb

Now, let’s assume we have an Excel file named Signups.xlsx containing a single sheet with the following data.

Name Phone number Party size
Brice 555-5551 4
Ryan 555-5552 1
David 555-5553 2

The first thing we need to do is tell EF how to connect to the file. Inside your DbContext.OnConfiguring (or similar), call UseJet with an appropriate connection string.

class SignupContext : DbContext
{
    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseJet(
            """
            Provider = Microsoft.ACE.OLEDB.12.0;
            Data Source = Signups.xlsx;
            Extended Properties = 'Excel 12.0 Xml';
            """);
}

This is an OleDB connection string that says to use the ACE provider. It contains the path to our Signups.xlsx file. And finally, it includes additional properties that tell the provider we’re connecting to an modern Excel file.

Normally, we’d ask EF to reverse engineer a model and scaffold the appropriate classes, but the EF Core Jet provider is created primarily with Access databases in mind, so it currently isn’t able to reverse engineer Excel files.

Instead, we’ll use the following code to see what tables and columns are available.

using var db = new SignupContext();

db.Database.OpenConnection();
var connection = db.Database.GetDbConnection();

using var tables = connection.GetSchema("Tables");
foreach (DataRow table in tables.Rows)
{
    var tableName = (string)table["TABLE_NAME"];
    Console.WriteLine(tableName);

    var command = connection.CreateCommand();
    command.CommandType = CommandType.TableDirect;
    command.CommandText = tableName;
    using var reader = command.ExecuteReader(CommandBehavior.SchemaOnly);

    using var columns = reader.GetSchemaTable();
    foreach (DataRow column in columns.Rows)
    {
        Console.WriteLine($"    {column["DataType"]} {column["ColumnName"]}");
    }
}

When we run it against our Signups.xlsx file, we get the following output.

Sheet1$
    System.String Name
    System.String Phone number
    System.Double Party size

With this information, we’re able to create a class that maps to our spreadsheet.

[Keyless, Table("Sheet1$")]
class SignupEntry
{
    public string Name { get; set; }

    [Column("Phone number")]
    public string PhoneNumber { get; set; }

    [Column("Party size")]
    public double PartySize { get; set; }
}

I’m using a keyless entity type since I’m only going to be reading the data. If your data has a column (or columns) that can serve as the primary key, you probably want to specify them in your model.

Don’t forget to add a DbSet property to your DbContext.

public DbSet<SignupEntry> Signups { get; set; }

And that’s it! You should be able to query your Excel file just like any other data source.

var partyCount = db.Signups.Count();
Console.WriteLine($"Parties: {partyCount}");

var averagePartySize = db.Signups.Average(s => s.PartySize);
Console.WriteLine($"Average size: {averagePartySize}");

var largestParty = db.Signups.OrderByDescending(s => s.PartySize).First();
Console.WriteLine($"Largest: {largestParty.Name}, party of {largestParty.PartySize}");
Parties: 3
Average size: 2.3333333333333335
Largest: Brice, party of 4