A crazy idea: provide a reasonable toString for case classes
See original GitHub issueWhen having discussions with people in the community, I saw a misconception that as soon as compiler developers have some idea, it quickly gets implemented and shipped in the next release. At the same time, the misconception suggest that ideas from outsiders never get support and are never implemented.
The source of this misconception is that decisions that originate from compiler developers that do are not supported by majority in compiler team(s) are never publicly presented. This creates a biased distribution for ideas that are presented to public: most of them are already fleshed out, discussed in private and preliminary decision is frequently made. They have been privately discussed for month if not years in order to cover all the corner-cases of both abstract idea and practical implementation.
This issue exists to show discussion that normally happens in private, to demonstrate that quite a few of ideas that we have are not implemented, as not good enough
. This is an example of such a idea, that may simplify everyday life of a particular developer (me) but I believe it’s not good enough (yet?) to be implemented.
The proposal
This proposal proposes a best-effort to make toString on case classes be JSON. Another good option is proposed by @densh in comment below. I don’t have preference over either of two currently, as his version has also some convincing advantages, such as repl compatibility. Additionally, AFAIK this is what Python does.
Introduction:
I’ve been thinking about it for a year or so already, sometimes leaning towards “it’s very easy to implement, and it makes common usage easier, so why not do it”, while sometimes leaning towards “I like JSON as much as I like XML”.
Recently, we’ve had a lot of discussions with @felixmulder on how should dottydoc compiler part and web-part communicate. And it seemed that what he intended to implement is comparable in amount of code to implementing inside the compiler.
I’d like to finally have this discussion, mostly to gather opinions of people and see if we should do it. If we don’t do it in Dotty, I guess it’s better to never do it. So why not have discussion now?
Motivation
When discussing including serializer in standard library, in most cases this is seen as a bad idea, as there are multiple formats available and we do not want to bear the burden of supporting all of them. In particular there have always been an argument not to include JSON\XML\Protobuf serializer in standard library\compiler as we’ll need to maintain it.
Current status
Compiler generates a toString on case classes that calls into
ScalaRunTime..MODULE$._toString((Product)this);
That is implemented as
x.productIterator.mkString(x.productPrefix + "(", ",", ")")
Idea
TLDR: replace parenthesis with brackets and include field names. Add a best effort fromString(s): Option[CaseClass]
method.
Thinking about it, we already have a toString defined on all classes. So we already have a conversion from every case-class to a String. I propose to make this string try to be JSON.
And make companion object have a fromString
method that does best-effort in trying to create the class from a JSON
string.
Design guidelines
Several details of how I would prefer it to work:
toString
generated for case-classes emits a pretty-printed version of JSON, that can be compacted by a library function that strips whitespaces;fromString(s: String): Option[CaseClass]
is generated on companions of case classes;fromString
does not use reflection, but instead relies on existence offromString
for all the fields;fromString
on fields that are not case classes can be user-defiend in a companion of a field class or added through implicit conversion;fromString
is provided for some parts of stdlib such as primitives, commonly used classes such asjava.util.Date
, standard collections. For java-defined classes it is added through implicit conversion;- when we’ll have ADTs, generate a
fromString
for it automatically based on the hierarchy; - the implemented
fromString
andtoString
do not try to provide best performance and fit all requirements. They aren’t to replace JSON serializer but to provide an easy to use and convenient default that makes sense(unlike current toString that is impossible to read on nested case classes)
Proposed implementation
(c: CaseClass).toString()
- creates a string json entry called
_class$
with a fully specified class name of what is being serialized; - calls toString an all fields;
- creates a JSON string by computing
\n\t${field-name}: ${fieldValue}
to every field, concatenates it with field from step 1 and wraps all this in{}
.- This allows to create a pretty-printed version using a single StringBuffer that should be comparable to speed as current toString. It may get a bit slower due to having to append whitespaces and field names compared to current implementation.
fromString(s: String): Option[CaseClass]
- does not assume presence of whitespaces in the string and does not try to be very efficient. Does not always work if fields have crazy toStrings.
- checks if class name is the same, if not returns None
- splits the string to delegate parsing of fields to other
fromString
s, taking into account structure of JSON. It would need to import the scope of companion class of field type. See example below for illustration - if some of nested
fromStrings
calls for fields returned None, returns None - Otherwise, constructs Some(caseClass) using provided fields.
Demonstration:
case class TestClass(a: Int, b: Double) {
def toString = // synthesized by desugar as an untyped tree
q"{\"_class$$\": \"TestClass\",\n\t\"a\": ${a.toString},\n\t\"b\": ${b.toString}}"
}
case object TestClass$ {
def fromString(s: String): Option[TestClass] = {
val maping: Map[String, String] = callLibraryFunctionThatParsesSingleLevelOfJson
if (mapping.getOrElse("class$", "") != "TestClass") None
else {
val a: Option[Int] = {
import Int._
mapping.get("a").flatMap(fromString) // uses scope search to find one
// it is provided by companion of class scala.Int and imported above
}
val b: Option[Double] = {
import Double._
mapping.get("b").flatMap(fromString) // uses scope search to find one
// it is provided by companion of class scala.Double and imported above
}
if (a.isEmpty || b.isEmpty) None
else Some(TestClass(a.get, b.get))
}
}
}
case class TestWrapper(tc: TestClass, date: java.util.Date) {
def toString = // synthesized by desugar as an untyped tree
q"{\"_class$$\": \"TestWrapper\",\n\t\"tc\": ${tc.toString},\n\t\"date\": ${date.toString}}"
}
case object TestWrapper$ {
def fromString(s: String): Option[TestWrapper] = {
val maping: Map[String, String] = callLibraryFunctionThatParsesSingleLevelOfJson
if (mapping.getOrElse("class$", "") != "TestWrapper") None
else {
val tc: Option[TestClass] = {
import TestClass._
mapping.get("tc").flatMap(fromString) // uses scope search to find one
// it is provided by import above
}
val date: Option[java.util.Date] = {
import java.util.Date._
mapping.get("date").flatMap(fromString) // uses scope search to find one
// it is provided in DottyPrefer\similar place
}
if (tc.isEmpty || date.isEmpty) None
else Some(TestWrapper(tc.get, date.get))
}
}
}
Arrays & stuff that has crazy toString
First of all, the provided version of toString does not guarantee that it returns a valid JSON. And even if it does return a valid one, it provides no guarantee that you can deserialize from it. Take example of
case class Arr(a: Array[Int], stuff: BadClass)
class BadClass { def toString = "\n"}
I’m fine with this state, as this proposal does not aim to replace serialization frameworks and it does not make current situation with useless toString worse, while it would help in many common cases.
But If we decide that we would like to always generate valid JSON, we can instead of calling a toString
in this code:
def toString = // synthesized by desugar as an untyped tree
q"{\"_class$$\": \"TestClass\",\n\t\"a\": ${a.toString},\n\t\"b\": ${b.toString}}"
call a toJsonString
method, that has a low-priority implicit decorator implementation that calls toString
and provides correct escaping. Case classes would have a native toJsonString
that forwards to toString
Drawbacks
- toString becomes slower;
- there may be existing code that relies on current behavior of toString, for example by splitting result on parentheses;
- users may start relying on toString\fromString instead of using proper serialization;
- if JSON fades away as XML did, we may get stuck with it;
- JSON may not be the best format as it even has problems with proper support of Longs in practice;
Alternatives
- leave as is;
- simply change
ScalaRuntime.toString(Product):String
: to include some pretty printing; - another good option is proposed by @densh in comment
Real motivation
- I hate reading toString of dotty trees). We do have
show
, but if I need to read the internal raw structure of raw tree I still use Lisp-formatter in my emacs. - DottyDoc needs to emmit JSON. It would be nice to do it without external dependencies.
Issue Analytics
- State:
- Created 7 years ago
- Reactions:5
- Comments:16 (12 by maintainers)
Top GitHub Comments
Counter-idea: make case classes
.toString
generate Scala code that creates an instance that is identical to this one.Array($elem1, $elem2, ..., $elemN)
just like all the other collectionsExample
Currently:
This counter-proposal:
Related work
Li Haoyi’s pprint: http://www.lihaoyi.com/upickle-pprint/pprint/
I’d rather we invest our energy in coming up with a mechanism for abstracting over the structure of case classes so that we can express the boilerplate the compiler generates for case classes in the library.
Pardon my poor quasi-quote hygiene, but wouldn’t it be cool if you could implement
toString
once like this and have a specialized version in each ofHasToString
’s subclasses?