"Flaky" attribute for tests that are flaky in various environments #8237

@analogrelay

Tracking work:

public static class HelixQueues
{
    // Must be const because they are used in attributes!
    public const string All = "All";
    public const string Debian8 = "...";
    // ...
}

public sealed class FlakyAttribute : System.Attribute
{
    // Required for code inspection analytics (verifying that the issue remains open, etc.)
    public string GitHubIssueLink { get; }

    // Specific Helix queues on which this test is deemed "flaky".
    // If not specified or 'all', the test is flaky on all queues.
    // If 'none', the test is not flaky on any Helix queue (it runs normally on all of them).
    // Otherwise, this is a semicolon-delimited list of queues.
    public string OnHelixQueues { get; set; } = "all";

    // Indicates whether this test is flaky in AzDO
    public bool OnAzDO { get; set; } = true;

    public FlakyAttribute(string gitHubIssueLink)
    {
        GitHubIssueLink = gitHubIssueLink;
    }
}

Usage examples:

  • [Flaky("...")] - This test is always flaky
  • [Flaky("...", OnHelixQueues = "none")] - This test is flaky on AzDO but never on Helix
  • [Flaky("...", OnAzDO = false)] - This test is flaky on Helix but never on AzDO
  • [Flaky("...", OnHelixQueues = "Debian8.whatchamajigger...;Ubuntu.CromulantCrux...")] - This test is flaky on AzDO and specific Helix queues but never on the other Helix queues
  • [Flaky("...", OnHelixQueues = "Debian8.whatchamajigger...;Ubuntu.CromulantCrux...", OnAzDO = false)] - This test is flaky only on specific Helix queues, never on the other Helix queues or on AzDO
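
To make the cases in this list concrete, here is a minimal sketch of how tooling might evaluate the two properties for the current environment. `FlakyEvaluator`, `IsFlakyOn`, the convention that a `null` queue name means "running on AzDO", and the case-insensitive matching are all assumptions made for illustration; the proposal does not specify any of them.

```csharp
using System;
using System.Linq;

// Hypothetical helper; not part of the proposal itself.
public static class FlakyEvaluator
{
    // currentHelixQueue: name of the Helix queue running the test,
    // or null when the test runs on AzDO (an assumed convention).
    public static bool IsFlakyOn(string onHelixQueues, bool onAzDO, string currentHelixQueue)
    {
        if (currentHelixQueue == null)
        {
            return onAzDO;
        }

        // Unspecified or "all": flaky on every Helix queue.
        if (string.IsNullOrEmpty(onHelixQueues) ||
            string.Equals(onHelixQueues, "all", StringComparison.OrdinalIgnoreCase))
        {
            return true;
        }

        // "none": never flaky on Helix.
        if (string.Equals(onHelixQueues, "none", StringComparison.OrdinalIgnoreCase))
        {
            return false;
        }

        // Otherwise a semicolon-delimited list of queue names.
        return onHelixQueues
            .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(q => q.Trim())
            .Contains(currentHelixQueue, StringComparer.OrdinalIgnoreCase);
    }
}
```

For example, `FlakyEvaluator.IsFlakyOn("QueueA;QueueB", false, "QueueA")` evaluates to `true`, while the same call with a `null` queue (AzDO) evaluates to `false`. `QueueA` and `QueueB` are placeholder queue names.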

The idea is that the attribute defines the environments in which the test is flaky, and the tooling for those builds will sequester the test as necessary.

I'm open to discussion on making sure these properties are clear and understandable :). My goal was that, by default, Flaky indicates the test is flaky in all environments, and the other properties can then be used to "loosen" that.

The implementation is still somewhat TBD (I'll be playing with this today) but the idea is this:

  • The properties of the attribute determine if the flaky xunit trait will be applied
  • The build script will run two passes for each project:
    • One excluding the flaky trait
    • One including the flaky trait which will ignore the exit code, but still record the results.
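
The trait step in this list could be sketched as follows. This is only an illustration: the trait names (`Flaky:All`, `Flaky:AzDO`, `Flaky:Helix:<queue>`), the `FlakyTraits` class, and the idea of emitting one trait per environment are assumptions, not something this issue has settled. An xunit trait is a name/value pair, so the sketch returns plain `KeyValuePair<string, string>` values and does not reference xunit itself.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; the trait names are placeholders.
public static class FlakyTraits
{
    public static IReadOnlyList<KeyValuePair<string, string>> GetTraits(string onHelixQueues, bool onAzDO)
    {
        var traits = new List<KeyValuePair<string, string>>();
        bool allQueues = string.IsNullOrEmpty(onHelixQueues) ||
            string.Equals(onHelixQueues, "all", StringComparison.OrdinalIgnoreCase);
        bool noQueues = string.Equals(onHelixQueues, "none", StringComparison.OrdinalIgnoreCase);

        if (onAzDO && allQueues)
        {
            // Flaky everywhere: a single catch-all trait.
            traits.Add(Trait("Flaky:All"));
            return traits;
        }

        if (onAzDO)
        {
            traits.Add(Trait("Flaky:AzDO"));
        }

        if (allQueues)
        {
            traits.Add(Trait("Flaky:Helix:All"));
        }
        else if (!noQueues)
        {
            // One trait per listed queue.
            foreach (var queue in onHelixQueues.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
            {
                traits.Add(Trait("Flaky:Helix:" + queue.Trim()));
            }
        }

        return traits;
    }

    private static KeyValuePair<string, string> Trait(string name) =>
        new KeyValuePair<string, string>(name, "true");
}
```

A build script could then run one pass that excludes the traits matching its environment and a second pass that includes only those traits, ignoring the second pass's exit code, for instance through the xunit console runner's `-notrait` and `-trait` options.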

This all depends on the infrastructure supporting what I want to do here, but I think we can get away with not having a separate AspNetCore-flaky-ci run :).

I think this can live in https://github.com/aspnet/Extensions/tree/master/src/TestingUtils/Microsoft.AspNetCore.Testing and be accessible to everyone, or we can use a shared-source file like we do with SkipOnHelixAttribute today.

@Eilon @muratg @mkArtakMSFT @ajcvickers @HaoK @ryanbrandenburg @dougbu (maybe we need an aspnet/engineering GitHub team ;))

Labels: area-infrastructure (includes MSBuild projects/targets, build scripts, CI, installers, and shared framework)
