How (or why) is `Data.Set String` not a single type?

  • A+

I am studying Haskell the hard way, by trying to write something I find interesting, and right now I'm trying to figure out how to derive a Semiring in Haskell for a specific set of parsing problems:

class Semiring s where     zero, one :: s     mul, add  :: s -> s -> s  instance Semiring Bool where     zero = False     one = True     add = (||)     mul = (&&)  instance Semiring (Set String) where     zero    = empty      one     = singleton ""     add a b = union a b     mul a b = (/(c, d) -> c ++ d) $ cartesianProduct a b 

The Bool ({true, false}, ∨, ∧, false, true) version works great. So does an Int version. The last one is called a Parse Forest, and its representation is (E, ∪, ·, ∅, {<>}), where E is a set of strings and the {<>} is the set of the empty string.

When I try to compile this, I get:

Rigge…   115  10 error           • Illegal instance declaration for ‘Semiring (Set String)’ (All instance types must be of the form (T a1 ... an) where a1 ... an are *distinct type variables*, and each type variable appears at most once in the instance head. 

Which doesn't make that much sense to me. Set String is a distinct type, right, and all of the operations of the class Semiring should be expressible purely in terms of sets of strings.

If you want context, the project is at Rigged Regular Expressions. The Bool version merely reports back that the regex matched; an Int version reports back the number of different ways a regex could have matched (i.e "a" ~ /(a|a*)/ will return 2 because two distinct and unique subexpressions match); the ParseForest should return not the number of ways, but the set of all possible ways— but it can't, because I don't understand why I can't use a concrete data type, Set String, where another concrete data type like Int or Bool worked okay.


chi's answer describes how to do this by turning on an extension, which is perfectly fine. But if you wonder how anyone could get by without this extension, there are a couple approaches.

The simplest change would be to introduce a newtype wrapper, to explicitly get rid of the type variable yourself before defining an instance.

newtype StringSet = StringSet (Set String) instance Semiring StringSet where {...} 

Alternatively, it seems to me that you don't need to be so specific as String: your instance would work for any Monoid type, wouldn't it?

instance (Ord a, Monoid a) => Semiring (Set a) where   zero = empty   one = singleton mempty   add = union   mul a b = (uncurry (<>)) $ cartesianProduct a b 


:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: