How does Haskell know whether a data type declaration is a variable or a named type?

  • A+

Take a data type declaration like

data myType = Null | Container TypeA v  

As I understand it, Haskell would read this as myType coming in two different flavors. One of them is Null which Haskell interprets just as some name of a ... I guess you'd call it an instance of the type? Or a subtype? Factor? Level? Anyway, if we changed Null to Nubb it would behave in basically the same way--Haskell doesn't really know anything about null values.

The other flavor is Container and I would expect Haskell to read this as saying that the Container flavor takes two fields, TypeA and v. I expect this is because, when making this type definition, the first word is always read as the name of the flavor and everything that follows is another field.

My question (besides: did I get any of that wrong?) is, how does Haskell know that TypeA is a specific named type rather than an un-typed variable? Am I wrong to assume that it reads v as an un-typed variable, and if that's right, is it because of the lower-case initial letter?

By un-typed I mean how the types appear in the following type-declaration for a function:

func :: a -> a  func a = a 


First of all, terminology: "flavors" are called "cases" or "constructors". Your type has two cases - Null and Container.

Second, what you call "untyped" is not really "untyped". That's not the right way to think about it. The a in declaration func :: a -> a does not mean "untyped" the same way variables are "untyped" in JavaScript or Python (though even that is not really true), but rather "whoever calls this function chooses the type". So if I call func "abc", then I have chosen a to be String, and now the compiler knows that the result of this call must also be String, since that's what the func's signature says - "I take any type you choose, and I return the same type". The proper term for this is "generic".

The difference between "untyped" and "generic" is that "untyped" is free-for-all, the type will only be known at runtime, no guarantees whatsoever; whereas generic types, even though not precisely known yet, still have some sort of relationship between them. For example, your func says that it returns the same type it takes, and not something random. Or for another example:

mkList :: a -> [a] mkList a = [a] 

This function says "I take some type that you choose, and I will return a list of that same type - never a list of something else".

Finally, your myType declaration is actually illegal. In Haskell, concrete types have to be Capitalized, while values and type variables are javaCase. So first, you have to change the name of the type to satisfy this:

data MyType = Null | Container TypeA v 

If you try to compile this now, you'll still get an error saying that "Type variable v is unknown". See, Haskell has decided that v must be a type variable, and not a concrete type, because it's lower case. That simple.

If you want to use a type variable, you have to declare it somewhere. In function declaration, type variables can just sort of "appear" out of nowhere, and the compiler will consider them "declared". But in a type declaration you have to declare your type variables explicitly, e.g.:

data MyType v = Null | Container TypeA v 

This requirement exist to avoid confusion and ambiguity in cases where you have several type variables, or when type variables come from another context, such as a type class instance.

Declared this way, you'll have to specify something in place of v every time you use MyType, for example:

n :: MyType Int n = Null  mkStringContainer :: TypeA -> String -> MyType String mkStringContainer ta s = Container ta s  -- Or make the function generic mkContainer :: TypeA -> a -> MyType a mkContainer ta a = Container ta a 


:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: