158

I have the following SQL, which I am trying to translate to LINQ:

SELECT f.value
FROM period as p 
LEFT OUTER JOIN facts AS f ON p.id = f.periodid AND f.otherid = 17
WHERE p.companyid = 100

I have seen the typical implementation of the left outer join (ie. into x from y in x.DefaultIfEmpty() etc.) but am unsure how to introduce the other join condition (AND f.otherid = 17)

EDIT

Why is the AND f.otherid = 17 condition part of the JOIN instead of in the WHERE clause? Because f may not exist for some rows and I still want these rows to be included. If the condition is applied in the WHERE clause, after the JOIN - then I don't get the behaviour I want.

Unfortunately this:

from p in context.Periods
join f in context.Facts on p.id equals f.periodid into fg
from fgi in fg.DefaultIfEmpty()
where p.companyid == 100 && fgi.otherid == 17
select f.value

seems to be equivalent to this:

SELECT f.value
FROM period as p 
LEFT OUTER JOIN facts AS f ON p.id = f.periodid 
WHERE p.companyid = 100 AND f.otherid = 17

which is not quite what I'm after.

RustyTheBoyRobot
  • 5,891
  • 4
  • 36
  • 55
dan
  • 5,664
  • 8
  • 45
  • 59
  • Sweet! I've been looking for this for awhile but wasn't sure how to search for this. Not sure how to add tags to this answer. Here's the search criteria I used: linq to sql filter in join or from linq to sql where clause in join or from – Solburn Sep 21 '10 at 14:59

7 Answers7

261

You need to introduce your join condition before calling DefaultIfEmpty(). I would just use extension method syntax:

from p in context.Periods
join f in context.Facts on p.id equals f.periodid into fg
from fgi in fg.Where(f => f.otherid == 17).DefaultIfEmpty()
where p.companyid == 100
select f.value

Or you could use a subquery:

from p in context.Periods
join f in context.Facts on p.id equals f.periodid into fg
from fgi in (from f in fg
             where f.otherid == 17
             select f).DefaultIfEmpty()
where p.companyid == 100
select f.value
JaredMcAteer
  • 21,688
  • 5
  • 49
  • 65
dahlbyk
  • 75,175
  • 8
  • 100
  • 122
32

this works too, ...if you have multiple column joins

from p in context.Periods
join f in context.Facts 
on new {
    id = p.periodid,
    p.otherid
} equals new {
    f.id,
    f.otherid
} into fg
from fgi in fg.DefaultIfEmpty()
where p.companyid == 100
select f.value
ZenXavier
  • 321
  • 3
  • 5
17

I know it's "a bit late" but just in case if anybody needs to do this in LINQ Method syntax (which is why I found this post initially), this would be how to do that:

var results = context.Periods
    .GroupJoin(
        context.Facts,
        period => period.id,
        fk => fk.periodid,
        (period, fact) => fact.Where(f => f.otherid == 17)
                              .Select(fact.Value)
                              .DefaultIfEmpty()
    )
    .Where(period.companyid==100)
    .SelectMany(fact=>fact).ToList();
Prokurors
  • 2,458
  • 3
  • 40
  • 65
  • 2
    Very useful to see the lambda version! – Learner Dec 20 '17 at 14:47
  • 3
    `.Select(fact.Value)` should be `.Select(f => f.Value)` – Petr Felzmann Oct 11 '19 at 05:29
  • Beware. In my tests, this does not produce _desirable_ SQL. Instead of adding additional `JOIN...ON` conditions, this creates a subquery that first _finds all_ facts with `otherid == 17`, and then joins the subquery result instead of the table. Such an approach tends to lead to extensive index scans rather than using the index that was intended to support the join. Specifically, the subquery does not limit the rows it scans to those that would match the outer sequence, `Periods`. – Timo Nov 04 '22 at 11:59
  • Apologies, I stand corrected: **With a sufficiently large data set, SQL Server will optimize the subquery, applying the outer condition to filter it appropriately.** – Timo Nov 04 '22 at 14:07
5

Another valid option is to spread the joins across multiple LINQ clauses, as follows:

public static IEnumerable<Announcementboard> GetSiteContent(string pageName, DateTime date)
{
    IEnumerable<Announcementboard> content = null;
    IEnumerable<Announcementboard> addMoreContent = null;
        try
        {
            content = from c in DB.Announcementboards
              // Can be displayed beginning on this date
              where c.Displayondate > date.AddDays(-1)
              // Doesn't Expire or Expires at future date
              && (c.Displaythrudate == null || c.Displaythrudate > date)
              // Content is NOT draft, and IS published
              && c.Isdraft == "N" && c.Publishedon != null
              orderby c.Sortorder ascending, c.Heading ascending
              select c;

            // Get the content specific to page names
            if (!string.IsNullOrEmpty(pageName))
            {
              addMoreContent = from c in content
                  join p in DB.Announceonpages on c.Announcementid equals p.Announcementid
                  join s in DB.Apppagenames on p.Apppagenameid equals s.Apppagenameid
                  where s.Apppageref.ToLower() == pageName.ToLower()
                  select c;
            }

            // Add the specified content using UNION
            content = content.Union(addMoreContent);

            // Exclude the duplicates using DISTINCT
            content = content.Distinct();

            return content;
        }
    catch (MyLovelyException ex)
    {
        // Add your exception handling here
        throw ex;
    }
}
MAbraham1
  • 1,717
  • 4
  • 28
  • 45
  • wouldn't it be slower than doing the whole operation in a single linq query? – Umar T. Jun 23 '17 at 11:16
  • @umar-t, Yes most likely, considering this was more than eight years ago when I wrote it. Personally I like the correlated sub-query postulated by Dahlbyk here https://stackoverflow.com/a/1123051/212950 – MAbraham1 Jun 23 '17 at 17:11
  • 1
    A "union" is a different operation than a "cross-join". It's like addition vs. multiplication. – Suncat2000 Nov 18 '19 at 17:14
  • 1
    @Suncat2000, thank you for the correction. Happy Thanksgiving! – MAbraham1 Nov 25 '19 at 19:57
1

Can be written using composite join key. Also if there is need to select properties from both left and right sides the LINQ can be written as

var result = context.Periods
    .Where(p => p.companyid == 100)
    .GroupJoin(
        context.Facts,
        p => new {p.id, otherid = 17},
        f => new {id = f.periodid, f.otherid},
        (p, f) => new {p, f})
    .SelectMany(
        pf => pf.f.DefaultIfEmpty(),
        (pf, f) => new MyJoinEntity
        {
            Id = pf.p.id,
            Value = f.value,
            // and so on...
        });
Petr Felzmann
  • 1,271
  • 4
  • 19
  • 39
-1

Whilst my response below does not directly answer the question, I believe it offers an alternative to the core issue that a read might find valuable.

I end up at this thread and others looking for the EF equivalent of a simple self join SQL I had written. I included Entity Framework in my project to make my DB interactions easier, but having to use "GroupJoin" , "SelectMany" and "DefaultIfEmpty" is like having to translate to another language.

Additionally I work with engineer who are excellent at SQL but have limited C# skills. So I want a solution that they could read.

The solution that worked for me was:

context.Database.SqlQuery<Class>

This allows executing of SQL commands with result return in a typed object. So long as the column names returned match the property names of the given Class. For example:

 public class MeasurementEvent
{
    public int ID { get; set; }
    public string JobAssemID { get; set; }
    public DateTime? InspDate { get; set; }


}

var list = context.Database.SqlQuery<MeasurementEvent>(@"
                Select op.umeMeasurementEventID as ID, op.umeJobID+'.'+Cast(op.umeAssemblyID as varchar) as JobAssemID ,  insp.umeCreatedDate as InspDate 
                from uMeasurementEvents as op 
                    left JOIN   uMeasurementEvents as insp on op.umeJobID = insp.umeJobID and op.umeAssemblyID = insp.umeAssemblyID and insp.umeInstanceId = 1 and insp.umeIsInspector = 1
                  where  op.umeInstanceId = 1  and op.umeIsInspector = 0")
            .ToList();
Russell Munro
  • 479
  • 5
  • 9
-3

It seems to me there is value in considering some rewrites to your SQL code before attempting to translate it.

Personally, I'd write such a query as a union (although I'd avoid nulls entirely!):

SELECT f.value
  FROM period as p JOIN facts AS f ON p.id = f.periodid
WHERE p.companyid = 100
      AND f.otherid = 17
UNION
SELECT NULL AS value
  FROM period as p
WHERE p.companyid = 100
      AND NOT EXISTS ( 
                      SELECT * 
                        FROM facts AS f
                       WHERE p.id = f.periodid
                             AND f.otherid = 17
                     );

So I guess I agree with the spirit of @MAbraham1's answer (though their code seems to be unrelated to the question).

However, it seems the query is expressly designed to produce a single column result comprising duplicate rows -- indeed duplicate nulls! It's hard not to come to the conclusion that this approach is flawed.

onedaywhen
  • 55,269
  • 12
  • 100
  • 138