Releases: Kotlin/dataframe
v0.14: Kotlin 2.0 and many stability improvements
This release can mostly be described as a quality-of-life release. While there are not many new groundbreaking features at the moment, almost every part of the library has had some improvement. See the full list of changes below, but to highlight a few:
- We now officially support Kotlin 2.0+. The library is built with 2.0.20 now, so it will work with KSP 2.0.20 too.
- We've continued our work on the DataFrame Kotlin Compiler Plugin. While it is still experimental, it introduces an exciting new approach to working with your data in a zero-boilerplate, type-safe way, leveraging the power the Kotlin 2.0 compiler gives us. See this demo project to experiment with it yourself.
- See this notebook for some of the small yet exciting features of the 0.14 release!
0.14.2
Includes the fix: #934 which removes the slf4j-simple dependency, keeping just slf4j-api.
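For projects that relied on the logger binding that used to come along transitively, the change above may mean declaring a binding explicitly. The following is a minimal Gradle Kotlin DSL sketch, not an official recommendation: the Maven coordinates of the library are real, but the choice of slf4j-simple as the binding and its version number are assumptions for illustration. Pick whichever binding and version match the slf4j-api already on your classpath.

```kotlin
// build.gradle.kts (sketch)
dependencies {
    // Kotlin DataFrame 0.14.2, which now depends only on slf4j-api
    implementation("org.jetbrains.kotlinx:dataframe:0.14.2")

    // Assumption: you want simple console logging at runtime.
    // The binding and its version below are illustrative, not prescribed by the release.
    runtimeOnly("org.slf4j:slf4j-simple:2.0.16")
}
```

Without any binding on the classpath, SLF4J falls back to a no-op logger, so the extra dependency is only needed if you want to see log output.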
0.14.1
Includes the fix: #872 which fixes compatibility with Kandy v0.7.1.
Features
- Compiler plugin by @koperagen in #729
- added toDataFrame for float- and double iterables by @Jolanrensen in #631
- Allow any ArrowReader implementation to be used for reading Arrow data #627 by @fb64 in #628
- add random parameter to shuffle by @koperagen in #643
- apply ksp to multiplatform configs in multiplatform modules by @mgroth0 in #647
- Add separator parameter to DataFrame.flatten by @zaleslaw in #667
- POJO toDataFrame support (and array improvements) by @Jolanrensen in #650
- Add JDBC credentials extraction from env variables and improve exception handling by @zaleslaw in #692
- Added MS SQL support for the dataframe-jdbc module by @zaleslaw in #689
- Update SQL all table/schemas reading functions to return maps with table names by @zaleslaw in #718
- Add support for H2 modes by @zaleslaw in #720
- Add delimiter parameter to readDelimStr by @koperagen in #743
- Add an option to read Excel cell values as a String regardless of their content type by @koperagen in #745
- Add castTo to help working with implicitly generated schemas in notebooks and plugin by @koperagen in #747
- Replace Klaxon with kotlinx-serialization by @devcrocod in #603
- Add df.convertTo(schemaFrom) overload by @koperagen in #764
- Add Convert.asFrame function by @koperagen in #781
- Add extension functions for the ResultSet by @zaleslaw in #772
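As an illustration of one item in the list above, the random parameter on shuffle makes a shuffle reproducible when a seeded generator is passed in. This is a minimal sketch, assuming the overload has the shape shuffle(random: kotlin.random.Random); the helper name shuffledNames, the sample data, and the seed are hypothetical and only serve the example.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.dataFrameOf
import org.jetbrains.kotlinx.dataframe.api.shuffle
import kotlin.random.Random

// Hypothetical helper: shuffles a small sample frame with a seeded generator
// and returns the resulting order of the "name" column.
fun shuffledNames(seed: Int): List<Any?> {
    val df = dataFrameOf("name", "age")(
        "Alice", 15,
        "Bob", 20,
        "Charlie", 100,
    )
    // Assumption: shuffle accepts a kotlin.random.Random instance (added in #643).
    val shuffled = df.shuffle(Random(seed))
    return shuffled["name"].values().toList()
}

fun main() {
    // The same seed should produce the same row order on every call.
    println(shuffledNames(42) == shuffledNames(42))
}
```

Passing a seeded Random is mainly useful in tests and notebooks where a stable row order matters; calling shuffle() without arguments keeps the previous non-deterministic behavior.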
Work on the compiler plugin
- then operation in pivot column selection DSL inside aggregate by @koperagen in #617
- Compiler plugin fixes by @koperagen in #740
- Compiler plugin update by @koperagen in #755
- Adding utils to help ensure that compile time schema ~ runtime schema by @koperagen in #767
- Improve codegen for stdlib <-> df interop workflow by @koperagen in #763
- Add initial support for CS DSL in the compiler plugin by @koperagen in #783
- Refactor toDataFrame implementation in compiler plugin by @koperagen in #782
- [Compiler plugin] Avoid throwing debugging exceptions in user projects because of false positives by @koperagen in #788
- [Compiler plugin] silently abort interpretation in case of invariant errors by @koperagen in #812
- [Compiler plugin] Support ColumnName annotation in extension properties codegen by @koperagen in #818
- Update compiler plugin by @koperagen in #832
Fixes
- Added "debug mode" to catch type mismatches in columns during testing only by @Jolanrensen in #713
- Removing dangerous exception in ConvenienceSchemaGeneratorPlugin by @Jolanrensen in #655
- add gradle wrapper validation workflow by @sullis in #648
- Fix #640: Jupyter integration conflicts with variable type converters from other integrations by @ark-1 in #641
- KDoc fixes for IntelliJ 2024.X by @Jolanrensen in #613
- Fix some things that break in the build with K2 enabled by @koperagen in #665
- Add nullability inference support to dataframe-jdbc by @zaleslaw in #672
- Serialize BufferedImages as base64 by @ermolenkodev in #694
- Downsize test images to speed up unit tests execution by @ermolenkodev in #709
- KTNB-693 Send the full dataframe schema as metadata by @cmelchior in #706
- K2 preparation by @Jolanrensen in #708
- File and URL names are used in JDK / suggested in IDE auto-import. Rename our classes to avoid conflict by @koperagen in #725
- isComparable() fix for double describe() by @Jolanrensen in #726
- Fixes std() not inferring column types by @Jolanrensen in #728
- Pivot fix by @Jolanrensen in #735
- Enforce the project to be built with Java 11 by @Jolanrensen in #736
- Rename private properties with names matching library classes by @koperagen in #737
- Add a test for valueCounts by @koperagen in #756
- Revert "Add a test for valueCounts" by @koperagen in #757
- Revert filter condition, fixes broken logic by @koperagen in #759
- Add a test for valueCounts by @koperagen in #758
- Add a notice about external urls by @koperagen in #774
- Make sortBy(ColumnReference) accept pathOf without extra cast by @koperagen in #779
- Update MarkersExtractor.kt by @GeorgCantor in #791
- Build config and Debug mode by @Jolanrensen in #796
- This fixes Nothing by @Jolanrensen in #795
- Small convertTo fix by @Jolanrensen in #800
- Small implode fix by @Jolanrensen in #801
- GroupBy aggregate fix by @Jolanrensen in #803
- Support for the serialization of DataframeConvertible values in ValueColumns has been added to enhance visualization in the KTNB plugin. by @ermolenkodev in #823
- Hide properties of intermediate objects and remove data class attributes by @koperagen in #828
- Fixes reading CSV files with BOM characters by @Jolanrensen in #831
- Improve dataframe sorting in KTNB UI by handling non-comparable columns by @ermolenkodev in #836
- Empty csv fix by @Jolanrensen in #835
- Added test suite for PostgreSQL local tests with covering case with URLs by @zaleslaw in #798
- Enforcing kotlinx datetime bias by @Jolanrensen in #843
- allow applying ksp to custom configs by @mgroth0 in #842
- Fix duplicated jvm signature by @koperagen in #868
- Fixed flaky jsonWrite test by @Jolanrensen in #873
- Rerunning notebooks for 0.14.0-RC1 by @Jolanrensen in #870
- Remove samplesTest task from kover report so "build" doesn't trigger it by @koperagen in #865
- Improved SQL<->JDBC mapping by @zaleslaw in #855
- Add Language annotations for better IDE experience by @koperagen in #861
Docs and Examples
- Update docs and readme for 0.13.1 by @Jolanrensen in #629
- Fixed the documentation for SQL database connection in Kotlin Notebooks by @zaleslaw in #639
- Updated parse documentation by @Jolanrensen in #652
- Updated split documentation by @Jolanrensen in https://github.com/Kotlin/...
0.13.1 Columns Selection DSL, KDocs, Table Rendering, and Many Fixes!
We just released v0.13.1!
Documentation/Readme might not be up-to-date yet. However, feel free to test it in your project and let us know if something does not work as expected!
To sum up the most significant changes:
- We finally merged the new ColumnsSelection DSL (documentation will come too). This comes with KDocs everywhere, previously missing overloads, and clearer function names.
- We're continuously improving support for Kotlin Notebook by creating and improving a native table component for DataFrames, by @ermolenkodev
- (Nested) table output now looks better for notebooks on GitHub (for example)
- Improvements to Arrow reading (thanks, @fb64 and @Kopilov!)
- Many other fixes and version bumps, see below!
Check it out on Maven Central
Features
- ColumnsSelection DSL overhaul by @Jolanrensen in #372
- Add static rendering for GitHub by @ileasile in #480
- Static notebook fixes and examples by @Jolanrensen in #511
- Better static tables by @Jolanrensen in #519
- Added ! to DataColumnArithmetics by @Jolanrensen in #531
- Add read of Arrow TimeStamp without timezone as LocalDatetime #515 by @fb64 in #516
- Add readArrowReader method to allow loading a dataframe from an ArrowReader by @fb64 in #529
- Excel add new sheet without overwriting the file by @LeandroC89 in #157
- make forEach inline to enable composable and suspend calls in lambda by @koperagen in #572
- Add option to prevent Gradle plugin from adding dependency on KSP by @koperagen in #571
- add infer types parameter in DSL functions by @koperagen in #579
- Reading Arrow NullVector by @Kopilov in #550
Fixes
- Fix performance problem in rename implementation by @nikitinas in #532
- Ported the fix for JDBC integration from the 0.12.1 branch by @zaleslaw in #538
- Fix reading Excel file that has cells with formulas that return string by @jmrsnt in #484
- Catching Error too in isOpenApiStr by @Jolanrensen in #490
- Turning off the feedback widget in documentation as it's non-functional by @Jolanrensen in #491
- Fixes related to Kotlin Notebook plugin integration by @ermolenkodev in #501
- remove klaxon usage outside json.kt by @koperagen in #506
- use range indexing to get rows subset by @koperagen in #514
- Linter for all modules by @Jolanrensen in #521
- add analytics initialization script by @koperagen in #524
- Fix toDataFrame for value classes by @koperagen in #542
- make sure to adjust nullability in order to fix platform types by @koperagen in #552
- Fixes Api Guru examples by @Jolanrensen in #559
- generate "nullable properties" only where needed by @koperagen in #597
- Updating maven publication developers list by @Jolanrensen in #611
- Fix for #573: Change serialization format for rendering in IntelliJ IDEA by @ermolenkodev in #574
- Fix for mac-os permissions pre-commit by @Jolanrensen in #616
- prevent Klaxon from using default reflective serializer by @koperagen in #621
Docs and Examples
- update groupBy documentation by @koperagen in #465
- Docs updates by @koperagen in #545
- Update of the documentation for the 0.12.1 release by @zaleslaw in #556
- Documentation updates by @koperagen in #560
- migrate Titanic.ipynb to Kandy by @koperagen in #548
- fix broken links by @koperagen in #626
- Checking notebooks for 0.13.1 by @Jolanrensen in #625
- Fixes in resize-iframes.js by @Jolanrensen in #618
Version Updates
- Update simple-git version to 2.0.3 by @Jolanrensen in #535
- Updating docProcessor to 0.3.3 by @Jolanrensen in #600
- Ksp fix by @Jolanrensen in #592
- Version bumps by @Jolanrensen in #590
- Version bumping Dokka to 1.9.10 by @Jolanrensen in #599
- Updated MySQL and PostgreSQL versions in build config by @zaleslaw in #609
New Contributors
- @jmrsnt made their first contribution in #484
- @fb64 made their first contribution in #516
- @LeandroC89 made their first contribution in #157
Known Issues
- KDoc rendering looks odd on 2024 Nightly IntelliJ by @Jolanrensen in #585
- This is being worked on in IntelliJ, JetBrains Markdown, and here.
Full Changelog: v0.12.1...v0.13.1
0.12.1 Bug Fix: Improved SQL type to Kotlin types mapping
Read the JDBC support documentation
What's Changed
- Add tests for repeated read with limit from SQL table and ResultSet by @zaleslaw in #500
- Update Ktor dependency and adjust related imports by @zaleslaw in #492
- Add validation for SQL queries in DataFrame.readSqlQuery by @zaleslaw in #502
- Backport #501 to 0.12.1 by @ermolenkodev in #504
- Type mapping rewriting for JDBC integration by @zaleslaw in #505
- Fixing numerical types by @zaleslaw in #547
Full Changelog: 0.12...v0.12.1
0.12: SQL Databases as a Data Source via JDBC
Read the JDBC support documentation
What's Changed
- Added JDBC-integration by @zaleslaw in #451
- Add SQL database reading to documentation by @zaleslaw in #464
- Fixes the schema generation for JDBC integration by @zaleslaw in #470
- Predicate join operation by @koperagen in #434
- Improved rendering of types with arguments, like Comparable<*> by @Jolanrensen in #467
- Add the missing renderers for the bunch of intermediate objects by @ermolenkodev in #473
- Added updated gradle.properties version to RELEASE_CHECK_LIST by @Jolanrensen in #416
- Update gradle.properties version number to 0.12.0 by @Jolanrensen in #417
- Fixed rename behavior by @Jolanrensen in #419
- fix compilation for markers with empty bodies in Jupyter by @koperagen in #427
- Deprecation messages by @Jolanrensen in #430
- Temporary documentation sitemap solution by @Jolanrensen in #431
- DataColumn Sort with by @Jolanrensen in #425
- Versions update by @koperagen in #438
- Configure integrationTest using new notation instead of the deprecated one by @koperagen in #439
- Updating docs processor by @Jolanrensen in #440
- Docs update & DynamicDataFrameBuilder by @koperagen in #441
- pre-commit hook task fix by @Jolanrensen in #444
- Restructure Read operations page by @sarahhaggarty in #447
- Configure documentation by @koperagen in #453
- Public empty(DataFrameSchema) API by @koperagen in #452
- Fix samples tests compilation by @koperagen in #456
- [Fix] Linux multiple user permission to read xlsx files by @Jolanrensen in #472
Full Changelog: build-0.11.1...0.12
0.11.1
0.11.0 Onboarding documentation update and minor API changes
What's Changed
- Added the parent name in the flatten operation by @zaleslaw in #378
- Added required lowercase conversion for all the columns by @zaleslaw in #379
- Reworked of dfs functions to recursively (changing a return type of recurseable functions) by @Jolanrensen in #363
- Added release flag priority when choosing the version number by @koperagen in #382
- Added nullability support for the describe() by @zaleslaw in #384
- Added RELEASE_CHECK_LIST.md by @zaleslaw in #385
- Added some name repairing strategies by @zaleslaw in #386
- Added a link to the missed dataset titanic.csv by @zaleslaw in #389
- Fixed the link to the movie.csv by @zaleslaw in #390
- Extracted subchapters by @zaleslaw in #391
- Added a section about JDK and libraries versions to Readme.md by @zaleslaw in #392
- Fix missing articles and grammar in docs by @Jolanrensen in #398
- Split the Gradle and Kotlin build snippets for Android and Server-side by @zaleslaw in #400
- Added verification for Google search index by @Jolanrensen in #409
- Rewritten gettingStarted page by @zaleslaw in #407
- Updated diff function #339 by @koperagen in #410
- Added NA and NaN docs by @Jolanrensen in #411
- Fixed withValue() -> with { value } by @Jolanrensen in #403
- Fixed: Jupyter compile-time DF type not recognized by @Jolanrensen in #401
- Added row functions to "operation overview" to make it comprehensive by @koperagen in #412
Full Changelog: build-0.10.1...build-0.11.0
0.10.1 Bug fix: Android compatibility and KSP updates
What's Changed
- Avoiding to use the URL-encoded path by @Kantis in #357
- Avoiding to use the URL-encoded path - part 2 by @Jolanrensen in #360
- Parse String column to nullable Double with correct locale by @Kopilov in #226
- Migrate visualization in examples from lets-plot to kandy by @devcrocod in #359
- Variable width vector fix by @Kopilov in #350
- Configure writerside to include tables by @koperagen in #324
- update to hotfix KSP release by @koperagen in #366
- Gradle plugin references fix by @koperagen in #365
- Fixed formatting by @zaleslaw in #367
- Korro outputs by @koperagen in #370
- Android compatibility by @Jolanrensen in #371
- prevent CI/CD from publishing artifacts where version = build number by @koperagen in #377
New Contributors
Full Changelog: build-0.10.0...build-0.10.1
Dataframe 0.10.0
New version targeting Kotlin 1.8.20 and KSP 1.8.20-1.0.10
KDocs were introduced in many places, so check them out in the IDE!
Along with that, now you can see the result of most operations in the documentation in the form of interactive tables. Now it should be much clearer what is going on even for relatively complex operations such as pivot.
Known issues
There's an issue with incremental compilation in KSP 1.8.20-1.0.10 that sometimes leads to build errors when using our Gradle plugin. If you are experiencing this problem, try disabling incremental compilation or stick to an older version, for example 0.10.0-dev-1532.
New API
Check out the updated dataframe rendering API if you want to customize your outputs in the notebook or want to save or display dataframes in HTML format.
Auto Generated What's Changed
- Describe how to control content cell limit from Jupyter Notebook by @alllex in #220
- Update documentation version to 0.9 by @Adriankhl in #223
- Restore additional conversions from ValueColumn/ColumnGroup to FrameColumn by @nikitinas in #213
- Add nestedRowsLimit parameter to DisplayConfiguration by @pacher in #221
- Fix new columns names generation during split operation (#224) by @ermolenkodev in #240
- Cleaned up Access APIs docs by @Jolanrensen in #237
- Enable multi-round DataSchema declaration processing (#140) by @ermolenkodev in #247
- Added a Contribution Guide by @zaleslaw in #250
- Documentation improvements by @koperagen in #252
- Upgraded KotlinDL version and refactor Titanic in IDEA example by @zaleslaw in #260
- Migrate on writerside by @lananovikova10 in #272
- Converted all the data type mentions to the links on the mentioned types by @zaleslaw in #273
- Setup documentation CI/CD by @koperagen in #275
- setup algolia indexes update by @koperagen in #276
- fix artifact name by @koperagen in #277
- Converted all the types mentions to the links on the mentioned types pages by @zaleslaw in #278
- fixed 1password openapi example by @Jolanrensen in #285
- [Fix] Update.asFrame now takes filter into account. by @Jolanrensen in #283
- Rename Column typealias to AnyColumnReference by @Jolanrensen in #279
- explicitly state the idea behind navigation structure by @koperagen in #258
- Update Kotlin to 1.8.10. by @cmelchior in #294
- Add support for swing tables integration in IDEA plugin by @ermolenkodev in #290
- Hint about Kotlin Spark API in I/O docs by @Jolanrensen in #298
- Update Get started page by @sarahhaggarty in #293
- Reduced number of top-level infix functions by @Jolanrensen in #299
- [Bug fix] "SyntaxError: Invalid hexadecimal escape sequence" for rendering DataFrames with "" in content by @Jolanrensen in #303
- Added 3 APIs for Flatten function by @zaleslaw in #307
- fixed rowColExpression type in docs by @Jolanrensen in #301
- Added dropNaNs function overloads by @Jolanrensen in #305
- Set logging to QUIET for generateDataFrameXXX tasks by @Jolanrensen in #311
- Enable Doc processor plugin by @Jolanrensen in #308
- Overload toDataFrame for basic types to avoid surprising results by @koperagen in #314
- Fix .editorconfig to make IDEA not add final newlines to the notebooks by @ileasile in #323
- KDocs start using Doc Preprocessor Gradle plugin by @Jolanrensen in #214
- Change search application and index names by @lananovikova10 in #327
- Deprecation of iterable in the API with the addition of new methods by @zaleslaw in #320
- Provide extra pointers to operations with DataFrame type argument by @koperagen in #321
- Windows build fix by @Jolanrensen in #329
- Improve experience with HTML rendering of dataframes by @koperagen in #300
- lower max memory by @koperagen in #330
- Column Selection DSL improvements by @Jolanrensen in #318
- Version bumps and dependency fixes by @Jolanrensen in #343
- Columns Selection DSL KDocs and missing API overloads by @Jolanrensen in #331
New Contributors
- @alllex made their first contribution in #220
- @Adriankhl made their first contribution in #223
- @pacher made their first contribution in #221
- @ermolenkodev made their first contribution in #240
- @zaleslaw made their first contribution in #250
- @lananovikova10 made their first contribution in #272
- @cmelchior made their first contribution in #294
- @sarahhaggarty made their first contribution in #293
Full Changelog: build-0.9.1...build-0.10.0
Dataframe 0.9.1
Kotlin Dataframe 0.9.1 released!
Blog post: https://blog.jetbrains.com/kotlin/2023/01/kotlin-dataframe-0-9-1-released/
Get it on Maven Central!
TL;DR
- OpenAPI type schemas can now be parsed and converted into data schemas. Read about it in the blog post.
- New JSON reading options include type clash tactics and key/value paths. Read about it in the blog post.
- Support for writing Apache Arrow files has been added. Read about it in the blog post.
- Many bugs have been fixed and some new stuff has been added. Read about it in the blog post.
- Make sure to update your Kotlin Jupyter kernel if you use DataFrame there.
Auto Generated What's Changed
- Fixed typo by @njacobs5074 in #123
- Expand Arrow reading support by @Kopilov in #129
- #132 Add skipRows parameter to enable reading data with header from… by @koperagen in #135
- Consider skipRows when obtaining column indexes #132 by @koperagen in #137
- More Converting operations by @Kopilov in #133
- Fix examples by @Jolanrensen in #146
- use index math for skip rows #132 by @koperagen in #153
- Setup Kover by @koperagen in #154
- Publish KSP as fat jar to avoid transitive dependency on dataframe by @koperagen in #164
- fix BindException on CI by @koperagen in #166
- Merge core and tests modules by @koperagen in #165
- Arrow writing by @Kopilov in #162
- Revert "Arrow writing" by @koperagen in #167
- Fix problem with big arrow by @ileasile in #163
- generate a new marker when dataframe marker is not a data schema by @koperagen in #168
- Add a way of explicit switching to dark color scheme. by @ileasile in #174
- Nullable accessors and fixed convertTo by @Jolanrensen in #175
- updating to kotlin 1.7.20 by @Jolanrensen in #180
- add delimiter option for solving read error by @LeeMH in #183
- Converting to date, Cell conversion exception by @Kopilov in #187
- OpenAPI/Swagger JSON type schema support + many small fixes I came across by @Jolanrensen in #173
- update csv read docs with ways to read locale specific numbers by @koperagen in #189
- CellConversionException in DataColumn<String?>.convertToDouble(locale) by @Kopilov in #190
- Fix for generic type to Any? erasure in new column creation by @Jolanrensen in #192
- Absolute path fixes in plugins by @Jolanrensen in #191
- fix for "Jupyter codegen: attempt to inherit from data class" by @Jolanrensen in #193
- Arrow writing by @Kopilov in #169
- Write nulls from columns to JSON explicitly by @koperagen in #195
- support applying column operations on row expression result in AddDsl by @koperagen in #196
- Backtick name bug by @Jolanrensen in #199
- Dataschema generic inheritance by @Jolanrensen in #200
- Add emptyDataFrame to documentation by @matthewwiese in #201
- Jupyter codegen isOpen both for var/val by @Jolanrensen in #202
- Sort grouped df by @Jolanrensen in #204
- Bugfix for NPE with fillNulls by @Jolanrensen in #205
- Support custom filling of missing columns in convertTo by @nikitinas in #207
- Added docs and clarifications to functional typealiases by @Jolanrensen in #203
- Min kernel version fix by @Jolanrensen in #209
- Don't print header record in csv if requested by @vhuc in #211
- read -> unfold + Add documentation for unfold #159 by @koperagen in #206
New Contributors
- @njacobs5074 made their first contribution in #123
- @LeeMH made their first contribution in #183
- @matthewwiese made their first contribution in #201
- @vhuc made their first contribution in #211
Full Changelog: build-0.8.0...build-0.9.1