Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify line measures #1216

Merged
merged 11 commits into from
Oct 8, 2024
Merged

Unify line measures #1216

merged 11 commits into from
Oct 8, 2024

Conversation

michaelkirk
Copy link
Member

@michaelkirk michaelkirk commented Sep 25, 2024

  • I agree to follow the project's code of conduct.
  • I added an entry to CHANGES.md if knowledge of this change could be valuable to users.

All of these algorithm implementations already existed, but I'm
proposing a new way to organize them which I hope will be more
consistent and discoverable.

Note: The new Haversine::bearing and Geodesic::bearing return 0..360,
the legacy traits HaversineBearing and GeodesicBearing returned
-180..180

Additional changes:

Deleted the deprecated Bearing trait which was previously superceeded
by the unambiguous HaversineBearing trait, but now we're reusing Bearing as the unified trait for all metric spaces (so you can call Haversine::bearing, Geodesic::bearing, etc.)

Future Work

In an effort to minimize this PR, while keeping the change reasonably
coherent, I've left some things out that I'd like to follow up with in future PRs

Methods on Euclidean

  • bearing (doesn't currently exist)
  • destination (doesn't currently exist)
  • intermediate (exists, but InterpolatePoint trait also needs intermediate_fill)
  • intermediate_fill (doesn't currently exist)

Deprecate Legacy Traits

  • Deprecate Legacy impls
  • Switcheroo the actual implementation: move the actual implementation to the new traits, and have the legacy traits delegate to the new traits.
  • Move over any tests from the legacy implementation to the new home

Methods on Line/LineString/Polygon (and multi)

  • Length
  • Densify

FIXES #1210
FIXES #1181

All of these algorithm implementations already existed, but I'm
proposing a new way to organize them which I hope will be more
consistent and discoverable.

Note: The new Haversine::bearing and Geodesic::bearing return 0..360,
the legacy traits HaversineBearing and GeodesicBearing returned
-180..180

Additional changes:

Deleted the deprecated `Bearing` trait which was previously superceeded
by the unambiguous `HaversineBearing` trait, but now is re-defined as
`Haversine::bearing`

= Future Work =

In an effort to minimize this PR, while keeping the change reasonably
coherent, I've left some things out

== Methods on Euclidean ==
 -[ ] bearing (doesn't currently exist)
 -[ ] destination (doesn't currently exist)
 -[ ] intermediate (exists, but InterpolatePoint trait also needs intermediate_fill)
 -[ ] intermediate_fill (doesn't currently exist)

== Deprecate Legacy Traits ==
 -[ ] Deprecate Legacy impls
 -[ ] Switcheroo the actual implementation: move the actual implementation to the new traits, and have the legacy traits delegate to the new traits.
 -[ ] Move over any tests from the legacy implementation to the new home

== Methods on Geoms (Future PR) ==
 -[ ] Length
     -[ ] Haversine
     -[ ] Rhumb
     -[ ] Geodesic
     -[ ] Euclidean
 -[ ] Densify
     -[ ] Haversine
     -[ ] Rhumb
     -[ ] Geodesic
     -[ ] Euclidean

FIXES #1210
FIXES #1181
This is needed for the "Unify line measures" work.
@michaelkirk
Copy link
Member Author

Note I've also bumped our MSRV to 1.75 to take advantage of "return-position impl Trait in trait" which is used in the new trait definitions.

@michaelkirk
Copy link
Member Author

Oh right, we made some changes to our ci builds. I'll follow up with another PR to adopt that before this can pass.

Copy link
Member

@urschrei urschrei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two comments. I know that they appear negative, but I think that the vast majority of this PR is useful and clarifying.


/// Euclidean space measures distance with the pythagorean formula - what you'd measure with a ruler.
///
/// You must use projected coordinates with Euclidean space —
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know some people have complained about excessively technical language (I disagree), but I'm uneasy about using terminology like "ruler" or "projected coordinates". I think we should refer to this space and link to it as The Euclidean Plane, specifically "the set R2 of the ordered pairs of real numbers, equipped with the dot product", because it's unambiguous. We could also provide examples of what it isn't (I'm not particularly in favour, but I can see why it might be helpful).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that euclidean plane link - I'll work that in.

I think that your very technical description will be, on average, more confusing for someone (like me) who just wants "regular old normal measurements like I learned to do in middle school geometry".

My impression is that you are trying to target a more mathematically adroit user with your precise definition, which is a reasonable segment to target, but at the cost of confusing users like me... I'm not sure it's worth it.

Is there plausible confusion here? What other metric space would people think we're talking about?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Phrased a little differently:

I think the most likely mistake people will make is trying to put GPS coordinates into a euclidean method or vice-versa.

So rather than giving a complete dictionary definition of what a thing is I'm optimizing to have documentation that I think will lead to the fewest number of errors based on peoples short attention spans.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the most likely mistake people will make is trying to put GPS coordinates into a euclidean method or vice-versa.

I agree, but I also want to avoid using language that's going to be confusing to people who aren't familiar with coordinate systems. Fundamentally, we want people to know what the rules of the system they using are (because we are declining to enforce the rules using the type system)

I want people to know they're in The Euclidean Plane, because that's unambiguous information that they can expand upon if they're unsure, and it defines the rules that are in use w/r/t/ distance etc.

For people who are implementing algorithms using geo, they will want to know they're in R2 because the literature that you have to read to implement them often states this as an assumption.

So I would say I don't think we disagree about who we're targeting (the widest possible range of users).

///
/// # Units
/// - `origin`, `destination`: Point where the units of x/y represent non-angular units
/// — e.g. meters or miles, not lon/lat. For lon/lat points, use the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a similar vein, I don't think we should be referring to "meters or miles" here: non-Euclidean distances are often measured in meters / miles, so this is confusing. "Not lon/lat" is OK, but I'd prefer that this is either implicit (you can't measure lon / lat distances in Euclidean space, and we are in Euclidean space, therefore…) or that we decline to explain with physical-world examples completely.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-Euclidean distances are often measured in meters / miles, so this is confusing.

I believe you, but I honestly didn't know that was true. Can you give me an example?

you can't measure lon / lat distances in Euclidean space

Oh but you can! And it's a common error that gives you a meaningless result. Look no further than our own example code on the legacy euclidean distance:

let p1 = point!(x: -72.1235, y: 42.3521);
let p2 = point!(x: -72.1260, y: 42.45);
let distance = p1.euclidean_distance(&p2);

assert_relative_eq!(distance, 0.09793191512474639);

I think it's important to keep simple relatable language and examples to help mitigate that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you, but I honestly didn't know that was true. Can you give me an example?

Any physical measurement of distance on the earth's surface (or a simplified abstraction of it) is definitionally non-Euclidean because it's a curved surface.

Copy link
Member Author

@michaelkirk michaelkirk Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, that makes sense. Thank you for bearing with me. I don't have a formal GIS background, so a lot of things that may seem obvious, I truly don't know.

I've included the link to [Euclidean Plane], which I hope addresses your ambiguity concern.

I've also linked to the Transform trait in reference to what we mean by "projection". While also not 100% precise, I think that it will be helpful to those who know a little, without confusing those who know a lot.

I still think it's important that our documentation aim to help people avoid the most likely errors by saying something like:

this method takes lng/lat

But the negation:

this method does not take lng/lat

Feels awkward and unnecessarily mysterious without an example. I now understand how Euclidean space is not the only space that could use meters as a unit, but as an example of what we mean by "not lng/lat", I think it is clarifying, and unlikely to confuse.

So if I've said something that is incorrect, let me know and I'll try again to fix it. But if it's more so that you think it reads like it was written for dummies, well then I'd please beg of you to have mercy on us dummies.

@michaelkirk
Copy link
Member Author

Oh right, we made some changes to our ci builds. I'll follow up with another PR to adopt that before this can pass.

I forgot that @urschrei already updated our ci to the new system! So I just had to publish a container for the new MSRV.

/// If you have lon/lat points, use the [`Haversine`], [`Geodesic`], or other [metric spaces] -
/// Euclidean methods will give nonsense results.
///
/// Alternatively, you *can* use lon/lat points with Euclidean methods if you first [`Transform`]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the point you're trying to make, but if you transform lon/lat coordinates they're not lon/lat coordinates anymore. Could we say something like "If you wish to use Euclidean operations with lon/lat coordinates these must first be be transformed using the transform / transform_crs_to_crs methods or their immutable variants. Use of these requires the proj feature"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with your phrasing — I've updated it.

For my own understanding, are you trying to avoid the usage of the term "projection"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not consciously, but yes: If you're working with non-Euclidean coordinates, you're probably not that interested in projection. If you were, you wouldn't be using measures and algorithms that are slower and more complex than their 2D counterparts.

This is all part of a larger endeavour, in the sense that geo is trying to be a practical tool with which to do real work. A great deal of that work can be done by transforming geographic coordinates into geodetic coordinates, giving you access to the simpler, more robust abstractions in the Euclidean plane.

But we also know (because the abstractions and algorithms now exist in geo because people have asked for / contributed them) that some work is more suited to spherical geometry.

As it stands, I feel like our docs and the way that some of our algorithms are organised don't demarcate that relationship very well; In my mind, we're "Euclidean and Cartesian by default, but if you need spherical geometry and algorithms we've got some of those".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mind, we're "Euclidean and Cartesian by default, but if you need spherical geometry and algorithms we've got some of those".

Yeah, I think that accurately reflects georust/geo. I think it also pretty accurately reflects most historical GIS software.

This is all part of a larger endeavour, in the sense that geo is trying to be a practical tool with which to do real work.

💯 ❤️

If you were [interested in projection], you wouldn't be using measures and algorithms that are slower and more complex than their 2D counterparts.

With "The Web" as a publishing platform (e.g. webmaps, geojson) and more global datasets like OSM, using lon/lat coordinates in algorithms is increasingly not a conscious decision of the analyst/cartographer/person as "the best", but rather just happens to be the format that their source data is in, or that their presentation layer demands.

Having talked through this a bit, I think that's the group I'm trying to connect with, and often am implicitly biased towards in my documentation and API design.

michaelkirk referenced this pull request Sep 30, 2024
…trary earth radius

Similarly, the Geodesic calculations could take an arbitrary geod as an argument
rather than Geodesic::wgs84, while the default would still delegate to
the wgs84 instance.
@michaelkirk
Copy link
Member Author

Would you like to see further changes here @urschrei?

If so, that's OK, but could you please provide some specific language for the areas you're still concerned about?

@michaelkirk
Copy link
Member Author

@urschrei - Do you think you'll have time to review the doc fixups you requested, or would you prefer that I start soliciting other reviews?

Copy link
Member

@urschrei urschrei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@urschrei
Copy link
Member

urschrei commented Oct 7, 2024

(Upon reflection, maybe the best place for technical discussion of metric spaces etc is a "developer guide" which interested parties can consult when implementing algorithms, and which could be linked to from our PR template.)

@michaelkirk
Copy link
Member Author

michaelkirk commented Oct 7, 2024

Awesome, thank you. I'll merge this tomorrow unless I hear otherwise.

Maybe the best place for technical discussion of metric spaces etc is a "developer guide" which interested parties can consult when implementing algorithms, and which could be linked to from our PR template.

That might be helpful. If nothing else it'd at least be a consistent place to point people to if there's confusion.

The PR template is visible to people when they open their PR, which they presumably do only once they are finished with their work. At that point telling them "here's what you need to know about metric spaces to do your work" feels late.

We could link to it in https://github.com/georust/geo/blob/main/CONTRIBUTING.md, or maybe it could start as just a paragraph within that document? I'm sure lots of people skip that too, but 🤷

edit: to clarify, I'm also fine to have it in the PR template, but think it should also appear somewhere more proactive.

@urschrei
Copy link
Member

urschrei commented Oct 7, 2024

The template-too-late issue had occurred to me. As you say, it's hard to know where people start anyway, so we should probably link from multiple places, as you say.

This was a copy/paste error from the impl on Geodesic which truly does
only support f64 - for Haversine and Rhumb though, we can support any
CoordFloat+FromPrimitive
@michaelkirk
Copy link
Member Author

michaelkirk commented Oct 7, 2024

Just for the sake of transparency, I snuck in two more commits that I don't think will be controversial - a formatting fixup and I relaxed a trait impl from f64 to CoordFloat (it was a copy/paste error).

@michaelkirk michaelkirk added this pull request to the merge queue Oct 8, 2024
Merged via the queue into main with commit ccf2ffd Oct 8, 2024
15 checks passed
@michaelkirk michaelkirk deleted the mkirk/line-measures-3 branch October 8, 2024 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unify line measurement traits (tracking issue) Clarification Needed on Bearing Output Range
2 participants