Add internal DataFrame.toCode utility to help convert examples#1603
Add internal DataFrame.toCode utility to help convert examples#1603
Conversation
| schema().print() | ||
| } | ||
|
|
||
| fun DataFrame<*>.toCode(variableName: String = "df"): String = |
There was a problem hiding this comment.
- Could this return a
CodeString? similar togenerateInterfaces()etc. - I was a bit worried about the name, but now I look in the autocomplete; among
toHtml(),toList(),toJson(), "toCode()" is not that cluttering or weird :) (actually, it's thegenerateX()functions that stand out...) - I do think that the
val %variableName =in the beginning should be omitted. We probably can't assume that people who want to use it will use it solely as an assignment. Maybe they want to use the generated dataframe-creating code to be an expression, like:
exec("val list = listOf(${df.toCode()})")There was a problem hiding this comment.
Could this return a CodeString? similar to generateInterfaces() etc.
I think yes. I'd also consider CodeString.copyToClipboard()
I do think that the val %variableName = in the beginning should be omitted.
What if nullable parameter: variableName: String?
There was a problem hiding this comment.
Yes, good idea for both :)
| } | ||
|
|
||
| private fun StringBuilder.appendColumn(column: DataColumn<Any?>, pad: String) { | ||
| append("$pad\"${column.name()}\" to columnOf(") |
There was a problem hiding this comment.
maybe it's safer to include the type in columnOf explicitly :) It solves a lot of the .toLiteral() cases already, like columnOf<Long>(1, 2, 3), but even ones you don't yet catch, like columnOf<Short>(1, 2, 3)
There was a problem hiding this comment.
I'll add Short and other literals, would prefer to avoid redundant type parameter so it's easy to copy-paste as is
There was a problem hiding this comment.
As long as we can somehow guarantee that the provided literals/class initializations are always the expected type, it will be fine. Maybe we could also parameterize this option if we actually want to promote this method to the public API?
There was a problem hiding this comment.
I'll setup extensive testing with Kotest https://kotest.io/docs/proptest/property-based-testing.html and Jupyter Kernel to execute generated code and compare resulting instance to original + compilation
| append("),\n") | ||
| } | ||
|
|
||
| private fun Any?.toLiteral(): String = |
There was a problem hiding this comment.
You can supply a KType to the function, which saves instance checks :)
There was a problem hiding this comment.
i think instance check is very cheap on JVM and we have a smartcast as a result
There was a problem hiding this comment.
Would you prefer cheap or free? ;P but yeah, the smart casts are nice
There was a problem hiding this comment.
How free? :o Maybe i don't get the idea
There was a problem hiding this comment.
let's keep instance checks, i added a lot of branches where smart casts are useful
There was a problem hiding this comment.
It's free because you get the values from a DataColumn where the KType is already known. So if we would supply this KType to toLiteral() it's simply an equality check instead of an instance check.
| is Char -> "'$this'" | ||
| is Long -> "${this}L" | ||
| is Float -> "${this}f" | ||
| else -> toString() |
There was a problem hiding this comment.
Lists might be another nice (and commonly used) addition. Maybe maps?
Actually, that does point out a limitation of the current implementation; it's not extensible and easy to break with unsupported types.
Maybe we could add a lambda to this function so people can supply their own 'toCode renderer'. Something like
typealias ValueToCodeRenderer = (value: Any?, type: KType) -> CodeString
val toStringRenderer: ValueToCodeRenderer = { value, _ -> value.toString() }
fun DataFrame<*>.toCode(customRenderer: ValueToCodeRenderer = toStringRenderer): CodeStringwdyt?
There was a problem hiding this comment.
List and Map seem reasonable by default. Code renderer too
dataFrameOf is compiler plugin-friendly way to serialize dataframes without FrameColumns, but converting existing row-based dataFrameOf(names)(values), etc, is bothersome. This API could help to quickly go from notebook to gradle project or from tests to project
I put it in tests module with intention to maybe later expose it as API