-
Couldn't load subscription status.
- Fork 244
Orthographic #1135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Orthographic #1135
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't check the math @timoore, but at a high level this looks great! Just minor code nitpicks in the review below. I'd like to play with this a bit in Unreal (which does have orthographic viewports) before merging it just to get a feel for the SSE behavior, but I love how you've approached this!
Cesium3DTilesSelection/include/Cesium3DTilesSelection/ViewState.h
Outdated
Show resolved
Hide resolved
Cesium3DTilesSelection/include/Cesium3DTilesSelection/ViewState.h
Outdated
Show resolved
Hide resolved
Cesium3DTilesSelection/include/Cesium3DTilesSelection/ViewState.h
Outdated
Show resolved
Hide resolved
|
Unity might also be a good test too, since orthographic mode just requires a bool: https://docs.unity3d.com/6000.0/Documentation/ScriptReference/Camera-orthographic.html |
Caught by clang-tidy
Ellucidate the method of extracting clipping planes from a projection matrix.
|
This should be ready for final review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good @timoore! I just have the one concern below.
| // quite right: the distance is actually the slant distance, and the real | ||
| // transform contains a term for an offset due to a skewed projection which | ||
| // is ignored here. | ||
| const glm::dmat4& projMat = this->_projectionMatrix; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is called hundreds of times per frame, and it has gotten a lot more expensive in this PR. Fundamentally, I think it should still just be a linear function of geometricError / distance, right? Can we go back to computing something like the SSE denominator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is called hundreds of times per frame, and it has gotten a lot more expensive in this PR. Fundamentally, I think it should still just be a linear function of
geometricError / distance, right? Can we go back to computing something like the SSE denominator?
The screen space error is not a function of distance for orthographic projections and is a bit messy for skewed perspective projections. In its current form this function doesn't need special logic for those cases; they are just a property of the projection matrix.
I am going to go out on a limb and assert that the performance of this function is dominated by the cost of floating point division. Before, this function had one division. At first glance the new version does 8, but only 2 of the results are actually used. Looking at the emitted assembly code with -O2, it looks like gcc does a good job of eliminating unused values, to the point that it only does 2 divisions and 10 multiplications. In other words, the dot products that would produce x and z values of the projection aren't done, and neither are the multiplications by 0.0 and 1.0.
It could be more clear to write the code so that it only does the operations that are needed and leave the more general math in comments, but I don't think that's a blocker?
I notice that the comment isn't quite correct and I will correct it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't see any evidence MSVC is optimizing this in a RelWithDebInfo build. The disassembly of this function looks like this:
double ViewState::computeScreenSpaceError(
double geometricError,
double distance) const noexcept {
00007FFFFADCBC96 mov rdi,rcx
// Avoid divide by zero when viewer is inside the tile
distance = glm::max(distance, 1e-7);
00007FFFFADCBC99 maxsd xmm7,xmm2
00007FFFFADCBC9D movaps xmmword ptr [rax-38h],xmm8
// This is a simplified version of the projection transform and homogeneous
// division. We transform the coordinate (0.0, geometric error, -distance,
// 1) and use the resulting NDC to find the screen space error. That's not
// quite right: the distance argument is actually the slant distance to the
// tile, whereas view-space Z is perpendicular to the near plane.
const glm::dmat4& projMat = this->_projectionMatrix;
glm::dvec4 centerNdc = projMat * glm::dvec4(0.0, 0.0, -distance, 1.0);
00007FFFFADCBCA2 movups xmmword ptr [rsp+20h],xmm0
00007FFFFADCBCA7 lea rcx,[rax-58h]
00007FFFFADCBCAB movaps xmm8,xmm1
00007FFFFADCBCAF xorps xmm7,xmmword ptr [__xmm@80000000000000008000000000000000 (07FFFFBA9B580h)]
00007FFFFADCBCB6 movsd mmword ptr [rsp+30h],xmm7
00007FFFFADCBCBC movsd mmword ptr [rax-80h],xmm6
00007FFFFADCBCC1 call glm::operator*<double,0> (07FFFFA32C460h)
00007FFFFADCBCC6 xorps xmm0,xmm0
glm::dvec4 errorOffsetNdc =
projMat * glm::dvec4(0.0, geometricError, -distance, 1.0);
00007FFFFADCBCC9 movsd mmword ptr [rsp+28h],xmm8
00007FFFFADCBCD0 lea r8,[rsp+20h]
00007FFFFADCBCD5 movsd mmword ptr [rsp+20h],xmm0
00007FFFFADCBCDB lea rdx,[rdi+1C8h]
00007FFFFADCBCE2 movsd mmword ptr [rsp+30h],xmm7
00007FFFFADCBCE8 lea rcx,[errorOffsetNdc]
00007FFFFADCBCED movsd mmword ptr [rsp+38h],xmm6
00007FFFFADCBCF3 call glm::operator*<double,0> (07FFFFA32C460h)
errorOffsetNdc /= errorOffsetNdc.w;
00007FFFFADCBCF8 movsd xmm0,mmword ptr [rsp+48h]
double ndcError = (errorOffsetNdc - centerNdc).y;
// ndc bounds are [-1.0, 1.0]. Our projection matrix has the top of the
// screen at -1.0, so ndcError is negative.
return -ndcError * this->_viewportSize.y / 2.0;
}
00007FFFFADCBCFE lea r11,[rsp+0B0h]
errorOffsetNdc /= errorOffsetNdc.w;
00007FFFFADCBD06 divsd xmm0,mmword ptr [rsp+58h]
double ndcError = (errorOffsetNdc - centerNdc).y;
// ndc bounds are [-1.0, 1.0]. Our projection matrix has the top of the
// screen at -1.0, so ndcError is negative.
return -ndcError * this->_viewportSize.y / 2.0;
}
00007FFFFADCBD0C mov rbx,qword ptr [r11+10h]
centerNdc /= centerNdc.w;
00007FFFFADCBD10 movsd xmm1,mmword ptr [rsp+68h]
00007FFFFADCBD16 divsd xmm1,mmword ptr [rsp+78h]
double ndcError = (errorOffsetNdc - centerNdc).y;
// ndc bounds are [-1.0, 1.0]. Our projection matrix has the top of the
// screen at -1.0, so ndcError is negative.
return -ndcError * this->_viewportSize.y / 2.0;
}
00007FFFFADCBD1C movaps xmm6,xmmword ptr [r11-10h]
00007FFFFADCBD21 movaps xmm7,xmmword ptr [r11-20h]
errorOffsetNdc /= errorOffsetNdc.w;
00007FFFFADCBD26 subsd xmm0,xmm1
double ndcError = (errorOffsetNdc - centerNdc).y;
// ndc bounds are [-1.0, 1.0]. Our projection matrix has the top of the
// screen at -1.0, so ndcError is negative.
return -ndcError * this->_viewportSize.y / 2.0;
}
00007FFFFADCBD2A movaps xmm8,xmmword ptr [r11-30h]
00007FFFFADCBD2F xorps xmm0,xmmword ptr [__xmm@80000000000000008000000000000000 (07FFFFBA9B580h)]
00007FFFFADCBD36 mulsd xmm0,mmword ptr [rdi+38h]
00007FFFFADCBD3B mulsd xmm0,mmword ptr [__real@3fe0000000000000 (07FFFFBB4F748h)]
00007FFFFADCBD43 mov rsp,r11
00007FFFFADCBD46 pop rdi
00007FFFFADCBD47 ret
The important bit in my mind is those two calls to glm::operator*. If it's calling the mat4 * vec4 multiplication operator (twice) then that is a bunch of extra floating point operations that a) weren't done before, and b) shouldn't be necessary.
On the other hand, I can't measure any particular overhead from this, at least on my desktop on Windows. I'm still a little worried about low powered handheld and AR/VR devices, but it's probably true that there are bigger fish to fry. 🐟🍳
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took screenshots from the same viewpoint in main and in this PR, and flipped between them, and I can't see any difference in LOD whatsoever. So no issues there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It appears that MSVC isn't inliining the matrix multiply, which is unfortunate; if those calls are inlinined, then it's a pretty basic optimization to notice that results are unused and eliminate their computation. MSVC has managed to do that for the vector divide-by-scalar operations.
|
Thanks @timoore! |
Update Cesium3DTilesSelection::ViewState to take view and general projection matrix arguments. This supports orthographic projection.
Although there are tests for the new support code for creating projection matrices and CullVolume objects, there isn't a test yet for tile selection using an orthographic matrix, so this is a draft pull request for now.