Commit dfb3f7da authored by Alex Kodat's avatar Alex Kodat Committed by Commit Bot

[cpu-profiler] Ensure sampled thread has Isolate lock under Windows

While the sampler checked if the sampled thread had the Isolate locked
(if locks are being used) under Linux, the check was not done under
Windows (or Fuchsia) which meant that in a multi-threading application
under Windows, thread locking was not checked making it prone to seg
faults and the like as the profiler would be extracting info from a
heap in motion. The fix was to move the lock check into CpuSampler
and Ticker (--prof) so all OSes would do the correct check.

The basic concept is that on all operating systems a CpuProfiler, and
so its corresponding CpuCampler, the profiler is tied to a thread.
This is not based on first principles or anything, it's simply the
way it works in V8, though it is a useful conceit as it makes
visualization and interpretation of profile data much easier.

To collect a sample on a thread associated with a profiler the thread
must be stopped for obvious reasons -- walking the stack of a running
thread is a formula for disaster. The mechanism for stopping a thread
is OS-specific and is done in sample.cc. There are currently three
basic approaches, one for Linux/Unix variants, one for Windows and one
for Fuchsia. The approaches vary as to which thread actually collects
the sample -- under Linux the sample is actually collected on the
(interrupted) sampled thread whereas under Fuchsia/Windows it's on
a separate thread.

However, in a multi-threaded environment (where Locker is used), it's
not sufficient for the sampled thread to be stopped. Because the stack
walk involves looking in the Isolate heap, no other thread can be
messing with the heap while the sample is collected. The only ways to
ensure this would be to either stop all threads whenever collecting a
sample, or to ensure that the thread being sampled holds the Isolate
lock so prevents other threads from messing with the heap. While there
might be something to be said for the "stop all threads" approach, the
current approach in V8 is to only stop the sampled thread so, if in a
multi-threaded environment, the profiler must check if the thread being
sampled holds the Isolate lock.

Since this check must be done, independent of which thread the sample
is being collected on (since it varies from OS to OS), the approach is
to save the thread id of the thread to be profiled/sampled when the
CpuSampler is instantiated (on all OSes it is instantiated on the
sampled thread) and then check that thread id against the Isolate lock
holder thread id before collecting a sample. If it matches, we know
sample.cc has stop the sampled thread, one way or another, and we know
that no other thread can mess with the heap (since the stopped thread
holds the Isolate lock) so it's safe to walk the stack and collect data
from the heap so the sample can be taken. It it doesn't match, we can't
safely collect the sample so we don't.

Bug: v8:10850
Change-Id: Iab2493130b9328430d7e5f5d3cf90ad6d10b1892
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/2377108Reviewed-by: 's avatarPeter Marshall <petermarshall@chromium.org>
Commit-Queue: Peter Marshall <petermarshall@chromium.org>
Cr-Commit-Position: refs/heads/master@{#69623}
parent 61216077
......@@ -73,6 +73,9 @@ class ThreadManager {
bool IsLockedByCurrentThread() const {
return mutex_owner_.load(std::memory_order_relaxed) == ThreadId::Current();
}
bool IsLockedByThread(ThreadId id) const {
return mutex_owner_.load(std::memory_order_relaxed) == id;
}
ThreadId CurrentId();
......
......@@ -238,7 +238,6 @@ void SamplerManager::DoSample(const v8::RegisterState& state) {
Isolate* isolate = sampler->isolate();
// We require a fully initialized and entered isolate.
if (isolate == nullptr || !isolate->IsInUse()) continue;
if (v8::Locker::IsActive() && !Locker::IsLocked(isolate)) continue;
sampler->SampleStack(state);
}
}
......
......@@ -18,6 +18,7 @@
#include "src/diagnostics/perf-jit.h"
#include "src/execution/isolate.h"
#include "src/execution/runtime-profiler.h"
#include "src/execution/v8threads.h"
#include "src/execution/vm-state-inl.h"
#include "src/handles/global-handles.h"
#include "src/heap/combined-heap.h"
......@@ -881,7 +882,8 @@ class Ticker : public sampler::Sampler {
Ticker(Isolate* isolate, int interval_microseconds)
: sampler::Sampler(reinterpret_cast<v8::Isolate*>(isolate)),
sampling_thread_(
std::make_unique<SamplingThread>(this, interval_microseconds)) {}
std::make_unique<SamplingThread>(this, interval_microseconds)),
threadId_(ThreadId::Current()) {}
~Ticker() override {
if (IsActive()) Stop();
......@@ -903,6 +905,9 @@ class Ticker : public sampler::Sampler {
void SampleStack(const v8::RegisterState& state) override {
if (!profiler_) return;
Isolate* isolate = reinterpret_cast<Isolate*>(this->isolate());
if (v8::Locker::IsActive() &&
!isolate->thread_manager()->IsLockedByThread(threadId_))
return;
TickSample sample;
sample.Init(isolate, state, TickSample::kIncludeCEntryFrame, true);
profiler_->Insert(&sample);
......@@ -911,6 +916,7 @@ class Ticker : public sampler::Sampler {
private:
Profiler* profiler_ = nullptr;
std::unique_ptr<SamplingThread> sampling_thread_;
ThreadId threadId_;
};
//
......
......@@ -11,6 +11,7 @@
#include "src/base/template-utils.h"
#include "src/debug/debug.h"
#include "src/execution/frames-inl.h"
#include "src/execution/v8threads.h"
#include "src/execution/vm-state-inl.h"
#include "src/libsampler/sampler.h"
#include "src/logging/counters.h"
......@@ -28,12 +29,16 @@ class CpuSampler : public sampler::Sampler {
public:
CpuSampler(Isolate* isolate, SamplingEventsProcessor* processor)
: sampler::Sampler(reinterpret_cast<v8::Isolate*>(isolate)),
processor_(processor) {}
processor_(processor),
threadId_(ThreadId::Current()) {}
void SampleStack(const v8::RegisterState& regs) override {
Isolate* isolate = reinterpret_cast<Isolate*>(this->isolate());
if (v8::Locker::IsActive() &&
!isolate->thread_manager()->IsLockedByThread(threadId_))
return;
TickSample* sample = processor_->StartTickSample();
if (sample == nullptr) return;
Isolate* isolate = reinterpret_cast<Isolate*>(this->isolate());
sample->Init(isolate, regs, TickSample::kIncludeCEntryFrame,
/* update_stats */ true,
/* use_simulator_reg_state */ true, processor_->period());
......@@ -46,6 +51,7 @@ class CpuSampler : public sampler::Sampler {
private:
SamplingEventsProcessor* processor_;
ThreadId threadId_;
};
ProfilingScope::ProfilingScope(Isolate* isolate, ProfilerListener* listener)
......
......@@ -221,6 +221,17 @@ void LocalContext::Initialize(v8::Isolate* isolate,
isolate_ = isolate;
}
void LocalContext::AddGlobalFunction(const char* name,
v8::FunctionCallback callback) {
v8::Local<v8::Context> context = this->local();
v8::Local<v8::FunctionTemplate> func_template =
v8::FunctionTemplate::New(isolate_, callback);
v8::Local<v8::Function> func =
func_template->GetFunction(context).ToLocalChecked();
func->SetName(v8_str(name));
context->Global()->Set(context, v8_str(name), func).FromJust();
}
// This indirection is needed because HandleScopes cannot be heap-allocated, and
// we don't want any unnecessary #includes in cctest.h.
class InitializedHandleScopeImpl {
......
......@@ -317,10 +317,14 @@ class LocalContext {
v8::Context* operator*() { return operator->(); }
bool IsReady() { return !context_.IsEmpty(); }
v8::Local<v8::Context> local() {
v8::Local<v8::Context> local() const {
return v8::Local<v8::Context>::New(isolate_, context_);
}
v8::Isolate* isolate() const { return isolate_; }
void AddGlobalFunction(const char* name, v8::FunctionCallback callback);
private:
void Initialize(v8::Isolate* isolate, v8::ExtensionConfiguration* extensions,
v8::Local<v8::ObjectTemplate> global_template,
......
......@@ -3096,6 +3096,90 @@ TEST(MultipleIsolates) {
thread2.Join();
}
void ProfileSomeUnlockingCode(v8::Isolate* isolate) {
v8::CpuProfiler* profiler = v8::CpuProfiler::New(isolate);
v8::Local<v8::String> profile_name = v8_str("1");
profiler->StartProfiling(profile_name);
const char* source = R"(
function bar(depth) {
if (depth == 0) {
YieldIsolate();
} else {
depth += bar(--depth);
}
return depth;
}
function foo() {
const startTime = Date.now();
let x = 0;
for (let i = 0; i < 1e6; i++) {
for (let j = 0; j < 100 ; j++) {
x = i * j;
}
bar(Math.trunc(10 * Math.random()));
if ((Date.now() - startTime) > 50) break;
}
return x;
}
foo();
)";
CompileRun(source);
profiler->StopProfiling(profile_name);
profiler->Dispose();
}
class UnlockingThread : public v8::base::Thread {
public:
explicit UnlockingThread(LocalContext* context)
: Thread(Options("UnlockingThread")), context_(context) {}
void Run() override {
v8::Isolate* isolate = context_->isolate();
v8::Locker locker(isolate);
v8::Isolate::Scope isolate_scope(isolate);
v8::HandleScope scope(isolate);
v8::Local<v8::Context> context = context_->local();
v8::Context::Scope context_scope(context);
ProfileSomeUnlockingCode(isolate);
}
private:
LocalContext* context_;
};
// Checking for crashes with multiple thread/single Isolate profiling.
TEST(MultipleThreadsSingleIsolate) {
v8::Isolate::CreateParams create_params;
create_params.array_buffer_allocator = CcTest::array_buffer_allocator();
v8::Isolate* isolate = v8::Isolate::New(create_params);
{
v8::Locker locker(isolate);
v8::Isolate::Scope isolate_scope(isolate);
v8::HandleScope scope(isolate);
LocalContext context(isolate);
context.AddGlobalFunction(
"YieldIsolate", [](const v8::FunctionCallbackInfo<v8::Value>& info) {
v8::Unlocker(info.GetIsolate());
});
UnlockingThread thread1(&context);
UnlockingThread thread2(&context);
CHECK(thread1.Start());
CHECK(thread2.Start());
// For good measure, profile on our own thread
ProfileSomeUnlockingCode(isolate);
{
v8::Unlocker unlocker(isolate);
thread1.Join();
thread2.Join();
}
}
isolate->Dispose();
}
// Tests that StopProfiling doesn't wait for the next sample tick in order to
// stop, but rather exits early before a given wait threshold.
TEST(FastStopProfiling) {
......
......@@ -331,7 +331,7 @@ class ReadStringVisitor : public TqObjectVisitor {
Isolate::FromRoot(GetIsolateRoot(heap_addresses_.any_heap_pointer)),
resource_data));
#else
uintptr_t data_address = reinterpret_cast<uintptr_t>(resource_data);
uintptr_t data_address = static_cast<uintptr_t>(resource_data);
#endif // V8_COMPRESS_POINTERS
if (done_) return;
ReadStringCharacters<TChar>(object, data_address);
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment