You really need to have a clear idea of what constitutes a "rating" to you, in your application.
The rating of a component is one of those things that's often misunderstood or taken out of context. If, say, a resistor is rated 10W, and you apply 10.1W, is it suddenly going to burst into flames? Of course not.
Will its resistance go out of spec? Maybe, but "out of spec" only means anything if your conditions match the test conditions under which the manufacturer picked a value and called it the "rating".
Maybe you don't actually care if its resistance varies by 1.2% instead of 1%, and it'll continue to work perfectly well for the rest of your natural life under those conditions. Maybe in your application there's a heat sink or a fan, which means it runs cooler and can therefore handle more power before something bad happens - and it is, of course, completely up to you to determine what that "something" is.
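To put rough numbers on the "runs cooler, handles more" point, here's a minimal sketch in Python. It assumes a single lumped thermal resistance (theta, in °C/W) from part to ambient, which is a simplification, and the theta values and ambient temperature are made-up figures for illustration, not from any datasheet:

```python
# Rough sketch: estimate resistor surface temperature from dissipated power,
# assuming one lumped thermal resistance (theta, °C/W) to ambient.
# All numbers below are hypothetical, for illustration only.

def surface_temp(power_w, theta_c_per_w, ambient_c=25.0):
    """Steady-state estimate: T_surface = T_ambient + P * theta."""
    return ambient_c + power_w * theta_c_per_w

theta_free_air = 30.0   # °C/W, hypothetical free-air figure
theta_with_fan = 12.0   # °C/W, hypothetical figure with forced airflow

for p in (5.0, 10.0, 10.1):
    print(f"{p:>4.1f} W  free air: {surface_temp(p, theta_free_air):6.1f} °C"
          f"   with fan: {surface_temp(p, theta_with_fan):6.1f} °C")
```

The exact numbers don't matter; the point is that the same 10 W gives a very different surface temperature depending on how well you can get the heat out.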
Or maybe the manufacturer's rating represents the power level at which it doesn't quite catch fire under lab conditions, in which case you might want to choose a different figure anyway.
If you want to truly understand what a resistor "can handle", you first need to define "handle", and what constitutes a failure to "handle" that amount of power.
You could choose an arbitrary surface temperature under some equally arbitrary test conditions, or perhaps you could pick an allowable change in resistance from its nominal value at 20 °C.
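If you went with the resistance-drift criterion, a back-of-the-envelope check might look like this sketch, assuming a simple linear temperature-coefficient model; the nominal resistance and the 100 ppm/°C TCR are hypothetical values, and real parts aren't perfectly linear:

```python
# Sketch of the "change in resistance from nominal at 20 °C" criterion,
# using a linear TCR model. Values are assumptions for illustration.

def resistance_at(temp_c, r_nominal, tcr_ppm_per_c, t_ref_c=20.0):
    """Linear model: R(T) = R_nom * (1 + TCR * (T - T_ref))."""
    return r_nominal * (1.0 + tcr_ppm_per_c * 1e-6 * (temp_c - t_ref_c))

r_nom = 100.0   # ohms, nominal value at 20 °C (hypothetical part)
tcr = 100.0     # ppm/°C, hypothetical temperature coefficient

for t in (20.0, 70.0, 155.0):
    r = resistance_at(t, r_nom, tcr)
    print(f"{t:5.1f} °C -> {r:8.3f} ohm "
          f"({100.0 * (r - r_nom) / r_nom:+.2f}% from nominal)")
```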
If you're concerned about long-term reliability, you might instead pick a criterion related to thermal stress cracks forming over repeated heating and cooling cycles. That would be much harder to test but no less valid, and probably something you really would have to obtain from the manufacturer, as they're much better equipped to carry out that kind of testing than you are.
Whatever you choose, be sure to pick a way of determining the rating that actually reflects how you're going to use the part and which failure mode concerns you.
And then subtract 30%, just to be on the safe side.
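If you want that margin as a quick rule of thumb, something like this rough sketch would do it; pick_rating, the 30% figure, and the list of "standard" wattages are all just illustrative assumptions, not anyone's catalogue:

```python
# Quick sketch of the 30% safety margin: given the power you expect to
# dissipate, pick the next standard rating that still leaves that margin.
# The list of standard ratings is typical but not exhaustive.

def pick_rating(expected_power_w, margin=0.30,
                standard_ratings=(0.125, 0.25, 0.5, 1.0, 2.0, 5.0, 10.0, 25.0)):
    """Return the smallest standard rating whose derated value covers the load."""
    for rating in standard_ratings:
        if rating * (1.0 - margin) >= expected_power_w:
            return rating
    raise ValueError("No standard rating is big enough - use a larger part or a heat sink")

print(pick_rating(0.3))   # -> 0.5 (a 0.25 W part derated by 30% only covers 0.175 W)
print(pick_rating(8.0))   # -> 25.0 (a 10 W part derated by 30% only covers 7 W)
```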
