Get number of duplicates

I have some code that reads every line in a file, and I’d like to get the duplicated text and the number of times each one appears.


        Dim query =
        From line In System.IO.File.ReadAllLines(path)
        Let logRecord = line.Split("*"c)
        Order By logRecord(2)
        Select New Log With {.DateField = logRecord(0), .Ip = logRecord(1), .Comment = logRecord(2)}

But I can’t get the duplicate count.


Dim d As New Dictionary(Of String, Integer)
        For Each q In query
            If Not d.ContainsKey(q.Comment)
               d.add(q.Comment, 0)
            End If
        Next

Thanks!

Use the LINQ Group By operator to group the records by the text (the Comment field), then use Where to keep only the groups with more than one member.
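As a rough sketch of that suggestion, the following groups equal strings and keeps only the ones that repeat. The sample array and the names `Text` and `Occurrences` are made up for illustration; in the question's code the strings would come from `q.Comment`.

```vbnet
Imports System
Imports System.Linq

Module DuplicateCounts
    Sub Main()
        ' Hypothetical sample data standing in for the Comment values.
        Dim comments = {"timeout", "ok", "timeout", "denied", "ok", "timeout"}

        ' Group equal strings, count each group, and keep only the repeats.
        Dim duplicates = From c In comments
                         Group By Text = c Into Occurrences = Count()
                         Where Occurrences > 1

        ' Each result carries the grouped text and how often it occurred.
        For Each d In duplicates
            Console.WriteLine("{0}: {1}", d.Text, d.Occurrences)
        Next
    End Sub
End Module
```

Note the filter is `> 1`: a condition of `>= 1` would also keep values that appear only once.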

I also have this, but I don’t know how to get the values out of it…


        Dim b = From a In query _
                Order By a.Comment _
                Group a By a.Comment Into aCount = Group _
                Where aCount.Count() >= 1 _
                Select aCount

        For Each a In b
            ltrText.Text &= a.ToString
        Next

The output is just: Log Log

I’m not strong in VB, but I believe that you are pretty close. The “b” returned from the first statement is actually a collection of collections (or rather an enumeration of groups).

Your second statement iterates through the outer enumeration (the groups). Inside the loop body you should be able to read each group’s count and run a nested loop over the items in that group.
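Something along these lines should work, though treat it as a sketch rather than tested production code. The `Log` class here is a minimal stand-in for the one in the question, the sample records are invented, and `Members` is just a name chosen for the group. The trailing Select is left off so that the key (`Comment`) stays available next to the group.

```vbnet
Imports System
Imports System.Linq

Module GroupIteration
    ' Minimal stand-in for the Log class used in the question.
    Class Log
        Public Property Comment As String
        Public Property Ip As String
    End Class

    Sub Main()
        ' Hypothetical records; the real ones come from the file query.
        Dim logs = {
            New Log With {.Comment = "timeout", .Ip = "10.0.0.1"},
            New Log With {.Comment = "ok", .Ip = "10.0.0.2"},
            New Log With {.Comment = "timeout", .Ip = "10.0.0.3"}
        }

        ' Without a trailing Select, each result exposes both the key
        ' (Comment) and the grouped items (Members).
        Dim groups = From entry In logs
                     Group entry By entry.Comment Into Members = Group
                     Where Members.Count() > 1

        For Each g In groups
            ' Outer loop: one pass per group.
            Console.WriteLine("{0} appears {1} times", g.Comment, g.Members.Count())

            ' Inner loop: the individual Log items inside the group.
            For Each item In g.Members
                Console.WriteLine("  from {0}", item.Ip)
            Next
        Next
    End Sub
End Module
```

Calling `ToString` on a group or on a `Log` only prints a type name unless `ToString` is overridden, which is why the loop reads specific properties instead.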

If you want to concatenate string properties from the items, take a look at the Aggregate LINQ operator. Alternatively, select the strings, collect them into an array with ToArray(), and then combine them with the static String.Join method.
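Both options might look roughly like this. The sample data and the output format are assumptions for the sake of the example; the resulting string could then be assigned to something like `ltrText.Text`.

```vbnet
Imports System
Imports System.Linq

Module JoinComments
    Sub Main()
        ' Hypothetical sample data standing in for the Comment values.
        Dim comments = {"timeout", "ok", "timeout", "ok", "timeout", "denied"}

        ' Build one display string per duplicated value, then materialize
        ' the results into an array.
        Dim lines = (From c In comments
                     Group By Text = c Into Occurrences = Count()
                     Where Occurrences > 1
                     Select String.Format("{0} ({1})", Text, Occurrences)).ToArray()

        ' Option 1: String.Join inserts the separator between the items.
        Dim joined = String.Join(", ", lines)
        Console.WriteLine(joined)

        ' Option 2: Aggregate folds the items into a single string.
        ' This overload throws on an empty sequence, so check first.
        If lines.Length > 0 Then
            Dim folded = lines.Aggregate(Function(acc, s) acc & ", " & s)
            Console.WriteLine(folded)
        End If
    End Sub
End Module
```

String.Join is usually the simpler choice, since it handles an empty array without any extra check.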