@lightningterror lightningterror commented Sep 30, 2025

Description of Changes

GS/TC: Optimize block offset calculation.
Swap offset x and y loops:
Having the y loop first (outermost) allows for better optimization and caching, since it is the bigger of the two loops. Also optimize the start point loop conditions for the target rect.
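
To illustrate the loop swap described above, here is a minimal sketch, not the actual PCSX2 code. `BlockRect`, `CollectBlockOffsets`, and the row-major `y * pitch + x` layout are hypothetical names and assumptions made for illustration only; real GS block addressing is swizzled and format dependent. The sketch shows the general idea: with y as the outer loop, the per-row part of the offset is computed once per row, and both loops start at the rect's origin instead of scanning from zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical rectangle in block units (right/bottom exclusive).
// This is an illustrative type, not one taken from PCSX2.
struct BlockRect
{
    int left, top, right, bottom;
};

// Assumed row-major layout: offset = y * pitch + x.
// The real emulator uses format-specific block tables instead.
std::vector<std::uint32_t> CollectBlockOffsets(const BlockRect& r, int pitch_in_blocks)
{
    assert(pitch_in_blocks > 0);
    assert(r.left >= 0 && r.top >= 0);
    assert(r.left <= r.right && r.top <= r.bottom);

    std::vector<std::uint32_t> offsets;
    offsets.reserve(static_cast<std::size_t>(r.right - r.left) *
                    static_cast<std::size_t>(r.bottom - r.top));

    // Outer loop over y: the row base is computed once per row.
    for (int y = r.top; y < r.bottom; y++)
    {
        const std::uint32_t row_base =
            static_cast<std::uint32_t>(y) * static_cast<std::uint32_t>(pitch_in_blocks);

        // Inner loop over x: consecutive offsets within one row.
        for (int x = r.left; x < r.right; x++)
            offsets.push_back(row_base + static_cast<std::uint32_t>(x));
    }
    return offsets;
}
```

Under these assumptions, calling `CollectBlockOffsets({1, 2, 3, 4}, 10)` visits the two blocks of row 2 before the two blocks of row 3, so offsets are produced in ascending order within each row.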

Rationale behind Changes

Speed, optimizations.
Fixes #13338

Suggested Testing Steps

Test #13338
Test GTA: Liberty City Stories; its dump goes from 43 fps to 135 fps.
Test other games listed in #12794

Did you use AI to help find, test, or implement this issue or feature?

No.

@JordanTheToaster JordanTheToaster left a comment

Appears to work correctly and brings GTA LCS FPS back up above what it was before the regression.

@lightningterror lightningterror merged commit e550cf9 into master Sep 30, 2025
22 checks passed
@lightningterror lightningterror deleted the gs_block_offset_optimizations branch September 30, 2025 20:01
@lightningterror lightningterror commented

Posting Jordan's results, will be useful for progress report:

Call of Duty 2: 180 fps before, 194 fps after
Final Fantasy X: 720 fps before, 830 fps after
GH Van Halen (at native): 188 fps before, 201 fps after
Gun: 332 fps before, 456 fps after
.hack//Infection Part 1: 619 fps before, 912 fps after
Haven: Call of the King: 620 fps before, 660 fps after
Kung Fu Panda: 415 fps before, 436 fps after
LA Rush (with Tex in RT): 174 fps before, 180 fps after
Midnight Club 3: DUB Edition Remix: 290 fps before, 402 fps after
MLB 11: The Show: 639 fps before, 858 fps after
Need for Speed: Carbon: 138 fps before, 144 fps after
Project: Snowblind: 578 fps before, 680 fps after
Sakura Taisen: Atsuki Chishio ni: 983 fps before, 1107 fps after
SOTC: 187 fps before, 206 fps after
Test Drive: Eve of Destruction: 331 fps before, 378 fps after
The Godfather: 770 fps before, 875 fps after

Maybe someone can make some graphs, which would be nice.
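
As a starting point for such graphs, the relative gain can be derived from the before/after numbers posted above. The following is a minimal sketch; `Result`, `PercentGain`, and `PostedResults` are hypothetical helper names introduced here, and the figures are copied from the list in this comment.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One benchmark entry; fields mirror the "before"/"after" fps in the list above.
struct Result
{
    std::string game;
    double before_fps;
    double after_fps;
};

// Relative improvement in percent: (after - before) / before * 100.
double PercentGain(const Result& r)
{
    assert(r.before_fps > 0.0);
    return (r.after_fps - r.before_fps) / r.before_fps * 100.0;
}

// Numbers as posted in this comment; nothing here is newly measured.
std::vector<Result> PostedResults()
{
    return {
        {"Call of Duty 2", 180, 194},
        {"Final Fantasy X", 720, 830},
        {"GH Van Halen (at native)", 188, 201},
        {"Gun", 332, 456},
        {".hack//Infection Part 1", 619, 912},
        {"Haven: Call of the King", 620, 660},
        {"Kung Fu Panda", 415, 436},
        {"LA Rush (with Tex in RT)", 174, 180},
        {"Midnight Club 3: DUB Edition Remix", 290, 402},
        {"MLB 11: The Show", 639, 858},
        {"Need for Speed: Carbon", 138, 144},
        {"Project: Snowblind", 578, 680},
        {"Sakura Taisen: Atsuki Chishio ni", 983, 1107},
        {"SOTC", 187, 206},
        {"Test Drive: Eve of Destruction", 331, 378},
        {"The Godfather", 770, 875},
    };
}
```

Feeding each entry of `PostedResults()` through `PercentGain` gives one value per game that can be plotted as a bar chart; for example, Gun (332 fps to 456 fps) works out to a gain of roughly 37%.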

@lightningterror lightningterror changed the title from "GS/TC: Optimize block offset calculation." to "GS/TC: Optimize block offset calculations." Oct 4, 2025
Hancock33 added a commit to Hancock33/batocera.piboy that referenced this pull request Oct 5, 2025
------------------------------------------------------------------------------------------------------
chromebook-linux-audio.mk 90d29575f479b4fbef1d152a9739de5248c30d06 # Version: Commits on Sept 29, 2025
------------------------------------------------------------------------------------------------------
adl: install fixed rt1019-rt5682 tplg,

--------------------------------------------------
faudio.mk 25.10 # Version: Commits on Oct 01, 2025
--------------------------------------------------
Minor improvements to FAudio CI. 25.10 is functionally identical to 25.09.

Thanks to our [GitHub Sponsors](https://github.com/sponsors/flibitijibibo/), including...

Super Duper Sponsors:

- [Re-Logic](https://re-logic.com/)

Super Sponsors:

- @CDGKen

- @compcj

- @jbevain

- @kg

- @NoelFB

- @superjoebob

- @terinfire

- @TerryCavanagh

Sponsors:

- @bartwe

- @bwiklund

- @Conan-Kudo

- @Eldirans

- @GlaireDaggers

- @isaboll1

- @isadorasophia

- @larsiusprime

- @tgpholly

- @xxxbxxx

- [Bit Kid Games](http://bitkidgames.com/)

- [Lunar Ray Games](http://www.lunarraygames.com/),

----------------------------------------------------------------------------------------
amiberry.mk 452783ae8c94a67984b664e9b24f0b99ad163dcd # Version: Commits on Sept 29, 2025
----------------------------------------------------------------------------------------
ci: set least-privilege permissions,

----------------------------------------------------------------------------------------
applewin.mk e6201e04d34658299f63116751ba6b91fea10a7a # Version: Commits on Sept 28, 2025
----------------------------------------------------------------------------------------
Merge pull request #316 from AppleWin/master

Update,

-----------------------------------------------------------------------------------
ares.mk 18500e679bbe01464c22e49ea9ac3cfbac7a02e4 # Version: Commits on Oct 01, 2025
-----------------------------------------------------------------------------------
ld: Fixed a potential threading issue with frame prefetch,

----------------------------------------------------------------------------------
clk.mk 009f71a1866a291d88b6fe5dd41ace4fe485bfd3 # Version: Commits on Oct 01, 2025
----------------------------------------------------------------------------------
Update version number.,

------------------------------------------------------------------------------------------
dolphin-emu.mk a570b24c96128e5d342d9770fdbc1cf22f16ab89 # Version: Commits on Oct 01, 2025
------------------------------------------------------------------------------------------
Merge pull request #13985 from Dentomologist/jit64_fix_dcbz_regression

Jit64: Fix dcbz regression,

-------------------------------------------------------------------------------------------
duckstation.mk c79097226501ee5406dd950fac94bd41b8cde030 # Version: Commits on Sept 30, 2025
-------------------------------------------------------------------------------------------
System: Warn if geometry tolerance is not default,

-----------------------------------------------------------------------------------
eden.mk 4be6d30cd95634777e0ea0790d7c51a3d09bb773 # Version: Commits on Oct 01, 2025
-----------------------------------------------------------------------------------
[fixup] fix bad variable names (#2642)

--------------------------------------------------------------------------------------
hatari.mk 2cae7647a8c2a0343f3f2910023201475dd565ab # Version: Commits on Sept 28, 2025
--------------------------------------------------------------------------------------
Limit DMA sector count on Falcon to 14 bit (like on real hardware),

-------------------------------------------------------------------------------------
ikemen.mk 96eae81af39d1298f942cf3e2d711bb6e3ea768a # Version: Commits on Oct 01, 2025
-------------------------------------------------------------------------------------
style: fix code style issues with gofmt,

------------------------------------------------------------------------------------------
lightspark.mk cebd94d9829a6b4467df22b93bf72ce445343242 # Version: Commits on Sept 30, 2025
------------------------------------------------------------------------------------------
[test-runner] Use the new filesystem library, instead of `std::filesystem`

This replaces all uses of `std::filesystem` with the newly added

filesystem library.

It also makes some parts of the code simpler.,

------------------------------------------------------------------------------------------------
lindbergh-loader.mk 813ea1c49d4b96925c06c8f1b4caf9e8a749b405 # Version: Commits on Sept 30, 2025
------------------------------------------------------------------------------------------------
A few bugs fixed and 2 additions.,

---------------------------------------------------------------------------------------
melonds.mk f143e89c931d12a234851e443b7d85f8017db9a3 # Version: Commits on Sept 29, 2025
---------------------------------------------------------------------------------------
fix UB in software renderer,

---------------------------------------------------------------------------------------
openmsx.mk b2947c5237b569642f85ca283fac37f02c061dbf # Version: Commits on Sept 29, 2025
---------------------------------------------------------------------------------------
Simplify Shader API

A load/compile error in the Shader constructor now throws an exception

(before it only printed a warning). That means that now, after the

constructor finishes we have guaranteed a correct Shader object. That

allows to remove some checks and simplify the API.,

-----------------------------------------------------
pcsx2.mk v2.5.195 # Version: Commits on Sept 30, 2025
-----------------------------------------------------
- [GS/TC: Optimize block offset calculation.](PCSX2/pcsx2#13339)

,

------------------------------------------------------------------------------------
play.mk 29bf47316ae409099a36ea4cae084d9574e95dd1 # Version: Commits on Sept 28, 2025
------------------------------------------------------------------------------------
Add patch to batlgr3 to fix hanging in 3D scenes.,

--------------------------------------------------------------------------------------
ppsspp.mk 2f4b1adc98d36a4d3fdd0a413d65a7a0b306ed4c # Version: Commits on Sept 30, 2025
--------------------------------------------------------------------------------------
Merge pull request #20845 from hrydgard/checkbox-fix

Fix the OnClick behavior on checkboxes, oops. Fixes cheats.,

-------------------------------------------------------------------------------------
rpcs3.mk 23b339d410fafc21326502d381b4b03611fa294b # Version: Commits on Sept 30, 2025
-------------------------------------------------------------------------------------
rpcs3_version: Bump to 0.0.38,

---------------------------------------------------------------
ruffle.mk nightly-2025-10-01 # Version: Commits on Oct 01, 2025
---------------------------------------------------------------
## What's Changed

* build(deps): bump the cargo-minor group with 7 updates by @dependabot[bot] in ruffle-rs/ruffle#21806

* web: Add 'peer: true' to some dependencies by @tsunamistate in ruffle-rs/ruffle#21801

* build(deps-dev): bump the npm-minor group across 1 directory with 19 updates by @dependabot[bot] in ruffle-rs/ruffle#21809

* avm1: Do not reference trivially copyable objects by @kjarosh in ruffle-rs/ruffle#21721

**Full Changelog**: ruffle-rs/ruffle@nightly-2025-09-29...nightly-2025-10-01,

-----------------------------------------------------
ryujinx.mk 1.3.146 # Version: Commits on Oct 01, 2025
-----------------------------------------------------
Canary-1.3.146

----------------------------------------------------------------------------------------
thextech.mk 166402a2ab7f57d01450b77a8356f88edb2be25d # Version: Commits on Sept 30, 2025
----------------------------------------------------------------------------------------
Translated using Weblate (Japanese)

Currently translated at 60.8% (414 of 680 strings)

Co-authored-by: 3UPPER <[email protected]>

Translate-URL: https://hosted.weblate.org/projects/thextech/engine-general/ja/

Translation: TheXTech Engine/Engine General,

---------------------------------------------------------------------------------------
tsugaru.mk a6cf19ae556db952fa0092976f810989e6678232 # Version: Commits on Sept 28, 2025
---------------------------------------------------------------------------------------
Updated CMake minimum version to 3.20 (to fix macOS build),

----------------------------------------------------
xemu.mk v0.8.106 # Version: Commits on Sept 29, 2025
----------------------------------------------------
,

--------------------------------------------------------------------------------------------
xenia-native.mk 21166607227a2809f6b5c72dec6457e41394431c # Version: Commits on Sept 30, 2025
--------------------------------------------------------------------------------------------
[UI] Added some missing glyphs that might be used in text,

-----------------------------------------------------------------------------------
ymir.mk 6599af08e5a9f42dc54d496667c9a763f06e8a47 # Version: Commits on Oct 01, 2025
-----------------------------------------------------------------------------------
feat(debug): Added VDP2 CRAM palette viewer/editor,

----------------------------------------------------------------------------------------------
img-gpu-powervr.mk b60da16be1b36453aa46599889c28310fd148fb8 # Version: Commits on Apr 28, 2025
----------------------------------------------------------------------------------------------
Merge branch 'CR_17892_ethercat_ziv.xu' into 'jh7110-devel'

CR_17892_ethercat_ziv.xu

See merge request sdk/soft_3rdpart!95,

--------------------------------------------------------------------------------------------
sdl12-compat.mk 6152f8de38fa9188e280dbbba5d37b15a3a499b5 # Version: Commits on Sept 30, 2025
--------------------------------------------------------------------------------------------
Fixed style,

---------------------------------------------------------------------------------------
aic8800.mk 2bf2dc64bedaf3f0fcbcc206125afa5da8b3835b # Version: Commits on Sept 30, 2025
---------------------------------------------------------------------------------------
feat: release 4.0+git20250410.b99ca8b6-3,

------------------------------------------------------------------------------------
box64.mk d547fe4d372fb8249b0f8874a87088849c707f1d # Version: Commits on Oct 01, 2025
------------------------------------------------------------------------------------
[ARM64_DYNAREC] Small improvment on some invalid opcode handling,

---------------------------------------------------------------------------------------
corsixth.mk 47c2bd25f05412bb3ac2d06b48844b35b21d0a08 # Version: Commits on Oct 01, 2025
---------------------------------------------------------------------------------------
Epidemic. Vaccination cursor does not work fix (#3052)

* epidemic vaccination cursor does not work

* epidemic vaccination cursor does not work fix. improvements by review.

* epidemic vaccination cursor does not work fix.  lua tests error resolving.

* epidemic vaccination cursor does not work fix. improvements by review.,

------------------------------------------------------------------------------------------
devilutionx.mk 607aaf82045fd1c94e8b4cdb41432cabd510908d # Version: Commits on Oct 01, 2025
------------------------------------------------------------------------------------------
Update Spanish translation,

--------------------------------------------------------------------------------------
eduke32.mk 64cac0216881b39728d236728b38853afaa2a44f # Version: Commits on Sep 29, 2025
--------------------------------------------------------------------------------------
Update credits

---------------------------------------------------------------------------------------
etlegacy.mk 4e833f2626f8ba2514f8f4748b0fcba717e6e567 # Version: Commits on Oct 01, 2025
---------------------------------------------------------------------------------------
ui: Fix ctype usage. (#3199)

The argument of these functions is of type int, but only a very

restricted subset of values are actually valid.  The argument must either

be the value of the macro EOF (which has a negative value), or must be a

non-negative value within the range representable as unsigned char.

Passing invalid (negative char) values leads to undefined behavior.

This fixes a segfault on startup on NetBSD 11.,

--------------------------------------------------------------------------------------------
jazz2-native.mk b32f05bb3715df69d32cc73fd5e8f734bff334ad # Version: Commits on Sept 30, 2025
--------------------------------------------------------------------------------------------
Minor fixes,

----------------------------------------------------
nblood.mk r14283 # Version: Commits on Sept 30, 2025
----------------------------------------------------
-

---------------------------------------------------------------------------------------
stalker.mk 9ccefbb6cb72315cf7cc82c2f2096a1140d31094 # Version: Commits on Sept 29, 2025
---------------------------------------------------------------------------------------
build(deps): bump Externals/imgui from `28837ec` to `1f020e5` (#1959)

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>,

-----------------------------------------------------------------------------------------
supertux2.mk 53ebcfcba988dd46071df2903b56bfcf81bdc8ae # Version: Commits on Sept 30, 2025
-----------------------------------------------------------------------------------------
Fix MacOS executable name,

---------------------------------------------------------------------------------------
evsieve.mk b67f32c355eed2e9424aa26d10870700d6c2b3ff # Version: Commits on Sept 30, 2025
---------------------------------------------------------------------------------------
Update libevdev bindings.,

----------------------------------------------------------------------------------------
mangohud.mk b0586184e88221a12dcf7972218c9a467974eb9f # Version: Commits on Sept 28, 2025
----------------------------------------------------------------------------------------
params: use atomic in most places,

-------------------------------------------------------
rgbds.mk v1.0.0-rc2 # Version: Commits on Sept 30, 2025
-------------------------------------------------------
Release v1.0.0-rc2,

----------------------------------------------------------------------------------------
retroarch.mk 5ace54acc403b75c7bed0757c9dfb287cd08a4c0 # Version: Commits on Oct 01, 2025
----------------------------------------------------------------------------------------
Fetch translations from Crowdin,

----------------------------------------------------------------------------------------
doomretro.mk eb760ea9d244385b0c17e581d8e8ac5446d65f45 # Version: Commits on Oct 01, 2025
----------------------------------------------------------------------------------------
Further work on enhancements to player's path in automap,

--------------------------------------------------------------------------------------
gzdoom.mk f4cba8e3a23584743b988b1ed6fc6cea53c9575d # Version: Commits on Sept 28, 2025
--------------------------------------------------------------------------------------
Update text,

------------------------------------------------------------------------------------------
xash3d-fwgs.mk 7266c5469a83b406b0533b2ff7f12d6ba9c665cc # Version: Commits on Oct 01, 2025
------------------------------------------------------------------------------------------
ci: upgrade to latest SDL2 release,

-----------------------------------------------------------------------------------------------
libretro-bennugd.mk be9382f6e1f5fe2632db85ce102434c34b9b576a # Version: Commits on Oct 01, 2025
-----------------------------------------------------------------------------------------------
workaround for bug in zig causing windows crosscompile with zig to fail,

---------------------------------------------------------------------------------------------
libretro-citra.mk d6f50f7e03901bf2e6958b42d4737f8137abaae8 # Version: Commits on Oct 01, 2025
---------------------------------------------------------------------------------------------
libretro: on ios also turn off shader jit if unavailable,

-------------------------------------------------------------------------------------------------------
libretro-doublecherrygb.mk d5e0e0d31a8a09f4e3135089668433a50034a36b # Version: Commits on Sept 30, 2025
-------------------------------------------------------------------------------------------------------
Merge branch 'perfomance/improvements',

----------------------------------------------------------------------------------------------
libretro-fbneo.mk 9726100ba22a558290860a2648e1e6a8b8719478 # Version: Commits on Sept 30, 2025
----------------------------------------------------------------------------------------------
(libretro) update files,

--------------------------------------------------------------------------------------------------
libretro-gearcoleco.mk 51a90ea44632ccd17d74653b7f18197d84b31888 # Version: Commits on Oct 01, 2025
--------------------------------------------------------------------------------------------------
Update artifact name,

-------------------------------------------------------------------------------------------------
libretro-geargrafx.mk 8fb3d8869ad804713634d39fd046fb08c92edd72 # Version: Commits on Oct 01, 2025
-------------------------------------------------------------------------------------------------
Update artifact name,

--------------------------------------------------------------------------------------------------
libretro-gearsystem.mk 9c310da50d9bba03e6b694b68c6b5a2744e045e7 # Version: Commits on Oct 01, 2025
--------------------------------------------------------------------------------------------------
Update artifact name,

------------------------------------------------------------------------------------------------------
libretro-mame2003-plus.mk 59b8a9fb06a47a3ce6aecd09b07f3f001e3d9b08 # Version: Commits on Sept 30, 2025
------------------------------------------------------------------------------------------------------
Update segas32.c,

-------------------------------------------------------------------------------------------------
libretro-panda3ds.mk 2bedba53d33727e3820aed91d39656761f840206 # Version: Commits on Sept 29, 2025
-------------------------------------------------------------------------------------------------
Update .clang-format,

-----------------------------------------------------------------------------------------------
libretro-ppsspp.mk 2f4b1adc98d36a4d3fdd0a413d65a7a0b306ed4c # Version: Commits on Sept 30, 2025
-----------------------------------------------------------------------------------------------
Merge pull request #20845 from hrydgard/checkbox-fix

Fix the OnClick behavior on checkboxes, oops. Fixes cheats.,

--------------------------------------------------------------------------------------------
libretro-ps2.mk 9485a53fa5aa2bff17e04518116107f81a8c82e3 # Version: Commits on Sept 28, 2025
--------------------------------------------------------------------------------------------
Show git version hash,

----------------------------------------------------------------------------------------------
libretro-tic80.mk 463d8b14effb3a90c345fc15a14da297deb79773 # Version: Commits on Sept 29, 2025
----------------------------------------------------------------------------------------------
sokol branch 258 tile esc key mouse flicker fix (#2849),

---------------------------------------------------------------------------------------------
libretro-vba-m.mk 68e7d98b8503cbbe903c1b82215ce49ee03ef3b6 # Version: Commits on Oct 01, 2025
---------------------------------------------------------------------------------------------
Fix pause when inactive for new wxWidgets

Handle the Iconize event to pause when \pause when inactive\ is enabled,

to work around a change in recent versions of wxWidgets that ignore this

unfocus event.

Fix #1494.

Signed-off-by: Rafael Kitover <[email protected]>,

--------------------------------------------------------------------------------------------
slang-shaders.mk cc9d2d31e7e5ed71b4e6a34234cac2fff7774baf # Version: Commits on Oct 01, 2025
--------------------------------------------------------------------------------------------
remove top field first from 240p codepath

it was causing a black screen on 240p content by treating the entire screen like a scanline gap. That is, increasing the 'interlacing scanline bright %' value made the image visible again. This shouldn't even be involved in the 240p codepath, AFAICT, so I'm taking it out.,